Abstract:

We present a detailed comparison of the most recent sets of NNLO PDFs from the
ABM, CT, HERAPDF, MSTW and NNPDF collaborations. We compare parton distributions
at low and high scales and parton luminosities relevant for LHC phenomenology.
We study the PDF dependence of LHC benchmark inclusive cross sections and
differential distributions for electroweak boson and jet production in the cases
in which the experimental covariance matrix is available. We quantify the
agreement between data and theory by computing the χ² for each data set with all
the various PDFs. PDF comparisons are performed consistently for common values of
the strong coupling. We also present a benchmark comparison of jet production at
the LHC, comparing the results from various available codes and scale settings.
Finally, we discuss the implications of the updated NNLO PDF sets for the
combined PDF+α_s uncertainty in the gluon fusion Higgs production cross section.
1 Introduction



Parton distribution functions (PDFs) are one of the dominant sources of systematic
uncertainty in many of the LHC cross sections relevant for Standard Model precision
physics, Higgs boson characterization and new physics searches. The dependence of
benchmark total cross sections on PDFs at the 7 TeV LHC was discussed in Refs. [1,2].
The purpose of the present paper is on the one hand to update these benchmark
comparisons by including the most recent PDF sets from the various collaborations,
and on the other hand to perform quantitative comparisons with 7 TeV data for
differential distributions, and with 8 TeV data for inclusive cross sections.

There have been several new NNLO PDF releases since the previous benchmark studies
[1,2]. The ABM collaboration have released ABM11 [3], which supersedes ABKM09 [4]. It
uses the combined HERA-I data, MS-bar running heavy quark masses for DIS structure
functions [5], and provides PDF sets for a range of values of α_s in a fixed flavor
number scheme with N_f = 5. The CT collaboration have recently released a CT10 NNLO
PDF set [6], based on the same global dataset as CT10 NLO [7], and using a NNLO
implementation of the S-ACOT-χ variable flavor number scheme for heavy quark
structure functions [8]. The HERAPDF collaboration have released the HERAPDF1.5 NNLO
PDF set [9,10], which in addition to the combined HERA-I dataset uses the inclusive
HERA-II data from H1 [11] and ZEUS [12].¹ The latest release from NNPDF is the
NNPDF2.3 [13] set. Like the previous NNPDF2.1 release this uses the FONLL VFNS at
NNLO [14], and now also includes relevant LHC data for which the experimental
correlation matrix is available. This is currently the only set which includes LHC
data in the fit.

As in previous benchmarks, we also use the MSTW08 NNLO PDFs [15]. Although no new
public release has been provided, several partial updates have been presented,
discussing the impact on the MSTW08 PDFs of the combined HERA-I data and the
Tevatron W lepton asymmetry [16], of the LHC W lepton asymmetry data [17], and
additionally of the ATLAS W, Z and inclusive jet data [18]. We do not include in this
benchmark study the JR09 PDF set [19] because it is available only for a single value
of α_s(M_Z).

1 Note however that the fit uses preliminary data which are not exactly the same as
the final published data.

PDF sets will be compared consistently for a common value of α_s. All the PDF sets
included in this benchmark comparison provide α_s variations in a relatively wide
range, as summarized in Table 1. We will show results for PDFs, parton luminosities,
physical cross sections and χ² values for α_s(M_Z) = 0.118 as a baseline, and whenever
we want to study the effect of varying α_s we will provide results for two values of
α_s(M_Z), 0.117 and 0.119. The motivation for this choice is that these values
approximately bracket the current 2012 PDG best fit value [20],
α_s(M_Z) = 0.1184 ± 0.0007. They also include the preferred or best-fit α_s values of
CT, MSTW and NNPDF at NNLO [6,21-23]. When error sets are only provided at a single
value of α_s we will determine uncertainties at other values of α_s by computing
percentage uncertainties at the value of α_s at which error sets are provided, and
then applying the same percentage uncertainty to the central value computed for other
α_s values. For the PDF plots of Sect. 2 only (but not for luminosities) the
uncertainty shown on the plot for values of α_s for which error sets are not available
will be taken as the absolute PDF uncertainty computed at the α_s value at which error
sets are provided: this is because relative uncertainties on PDFs become meaningless
in regions where the PDF is very close to zero.
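The percentage-uncertainty prescription described above can be illustrated with a
minimal sketch. The function name and the numbers quoted afterwards are hypothetical
and serve only to show the arithmetic; they are not taken from any PDF set.

```python
def rescale_pdf_uncertainty(central_ref, error_ref, central_new):
    """Carry a PDF uncertainty from one alpha_s value to another.

    central_ref, error_ref: central value and PDF uncertainty at the alpha_s
    value for which error sets exist.
    central_new: central value computed at a different alpha_s value.
    The relative (percentage) uncertainty is assumed to be unchanged.
    """
    if central_ref == 0:
        # A relative uncertainty is undefined for a vanishing central value.
        raise ValueError("relative uncertainty undefined for zero central value")
    relative = error_ref / abs(central_ref)
    return relative * abs(central_new)
```

For instance, a hypothetical observable of 18.0 with uncertainty 0.36 (2%) at the
reference α_s would be assigned an uncertainty of 0.37 if its central value at a
neighbouring α_s were 18.5. The explicit error for a vanishing central value mirrors
the remark that relative uncertainties lose their meaning where a PDF approaches zero,
which is why absolute uncertainties are used for the PDF plots.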



PDF set          Reference   α_s^(0) (NLO)   α_s range (NLO)   α_s^(0) (NNLO)   α_s range (NNLO)
ABM11 N_f = 5    [3]         0.1181          [0.110, 0.130]    0.1134           [0.104, 0.120]
CT10             [6]         0.118           [0.112, 0.127]    0.118            [0.112, 0.127]
HERAPDF1.5       [9,10]      0.1176          [0.114, 0.122]    0.1176           [0.114, 0.122]
MSTW08           [15]        0.1202          [0.110, 0.130]    0.1171           [0.107, 0.127]
NNPDF2.3         [13]        all             [0.114, 0.124]    all              [0.114, 0.124]

Table 1: PDF sets used in this paper. We quote the value α_s^(0) for which PDF
uncertainties are provided, and the range in α_s in which PDF central values are
available (in steps of 0.001). For ABM11 the α_s-varying PDF sets are only available
for the N_f = 5 PDF set.



The structure of this paper is the following: in Sect. 2 we begin by comparing the
various sets of NNLO PDFs and the associated parton luminosities, and discuss the
similarities and differences between each of the sets. In Sect. 3 we compute
predictions for LHC inclusive cross sections at 8 TeV, including Higgs cross sections.
In Sect. 4 we compare PDF predictions for all available LHC data at 7 TeV with
experimental covariance matrix, and quantify the data-theory agreement for each of the
PDF sets. We then turn to discuss in more detail the case of the ATLAS inclusive jet
data in Sect. 5, where we compare different codes and theory scale settings for jet
production. In Sect. 6 we discuss the implications of this benchmarking for the
particular case of the Higgs cross section in gluon fusion and examine possible
extensions of the current (2010) PDF4LHC recommendation. Finally, we conclude and
discuss the prospects for future benchmarking studies in Sect. 7. A more technical
appendix summarizes the issue of the dependence on the χ² definition.

All the above groups provide versions of the respective PDF sets both at NLO and at
NNLO. In this paper we will show only the NNLO PDFs, for the particular values of α_s
mentioned above. We have however produced the results presented here also at NLO and
for a wider range of α_s values. The complete catalog of plots can be obtained online
from HepForge:

http://nnpdf.hepforge.org/html/pdfbench/catalog
2 Parton distributions and parton luminosities

In this section we compare PDFs and then parton luminosities between the various
groups. For definiteness we show here comparisons only between PDFs and luminosities
at NNLO for α_s = 0.118. Results for several other values of α_s and at NLO can be
obtained from the catalog of plots on the HepForge website.

2.1 Parton distributions 

We compare parton distributions at Q² = 25 GeV², above the b quark threshold, since
ABM11 only provide their N_f = 5 PDFs for a range of values of α_s.² For each PDF
we compare first NNPDF2.3, CT10 and MSTW08, and then NNPDF2.3, ABM11 and
HERAPDF1.5 (with NNPDF2.3 thus being used as a common reference). We consider
PDF uncertainties only and not the α_s uncertainty, except for the ABM11 PDFs, where
the α_s uncertainty is treated on an equal footing to the PDF parameters in the
covariance matrix. The ABM11 and HERAPDF results also include an uncertainty on quark
masses, whereas other groups provide sets with a variety of masses.

In Fig. 1 we show the total quark singlet PDF
$\Sigma(x,Q^2) = \sum_{i=1}^{n_f} [q_i(x,Q^2) + \bar{q}_i(x,Q^2)]$,
both on a linear and on a logarithmic scale, while in Fig. 2 we show the various gluon
PDFs g(x,Q²), also on linear and logarithmic scales. There is good agreement between
all the sets for the quark singlet, though the uncertainty band at small x is rather
wider for NNPDF and HERAPDF. The gluons of CT10, MSTW and NNPDF are also in reasonable
agreement: the PDF one-sigma uncertainty bands overlap over the whole range of x.
Differences are larger for ABM11. At small x the ABM11 gluon has much smaller
uncertainties than other groups, even for x values where there is little constraint
from the data, reflecting perhaps the more restrictive underlying PDF parametrization.
At high x the ABM11 gluon is smaller than that of CT, MSTW and NNPDF, though the
uncertainty band overlaps that of HERAPDF in most places. For HERAPDF1.5 the gluon at
large x has larger uncertainties due to the lack of collider data, while at small x it
is close to the other PDF sets as expected, since in this region it is only the precise
HERA-I data that provides any handle on the gluon.

The total strangeness s⁺(x,Q²) = s(x,Q²) + s̄(x,Q²) is shown on a logarithmic scale in
Fig. 3. HERAPDF1.5 is not included because it does not have an independent strangeness
parametrization, as HERA data alone do not allow disentangling of the strange
contribution. The CT10 strange distribution is somewhat higher than that of other
groups. The origin of this difference is not understood at present. Theoretical
studies, together with LHC data on electroweak vector boson production and on
exclusive W + c production, should shed light on this issue in the future. First ATLAS
data did give some indication on strangeness [24] at small x, but they are still not
accurate enough [13] to lead to definite conclusions.

Finally we compare non-singlet distributions: the non-singlet triplet and the total
valence PDFs, respectively defined as

T_3 = u + \bar{u} - d - \bar{d} ,
V   = u - \bar{u} + d - \bar{d} + s - \bar{s} ,    (1)

in Fig. 4, and the quark sea asymmetry Δ_S = d̄ - ū and the strangeness asymmetry
s⁻ = s - s̄ in Fig. 5. There is reasonable agreement for T_3 and V, except for ABM11,
for which T_3 at large x is significantly higher than in the other sets. This is due to
a larger u distribution in this region. The HERAPDF1.5 PDF uncertainties in T_3 are
rather larger, reflecting the fact that HERA data do not provide much information on
quark flavor separation. All sets are in broad agreement on the light sea asymmetry,
apart from HERAPDF1.5, which does not include the Drell-Yan and electroweak boson
production data and cannot separate ū and d̄ flavors. Only MSTW08 and NNPDF2.3 provide
independent parametrizations of the strange asymmetry PDF and are in reasonable
agreement within uncertainties.

2 The ABM11 PDFs are provided as FFN sets with different numbers of active flavours:
N_f = 3, 4 and 5. For scales Q² below the charm threshold the N_f = 3 set must be used,
between the charm and bottom thresholds the N_f = 4 set should be used, and above the
bottom threshold it is the N_f = 5 set to be used. Of all these various FFN sets, only
those with N_f = 5 are provided for a variety of α_s values.

Figure 1: The quark singlet PDFs xΣ(x,Q²) at Q² = 25 GeV² plotted versus x on a linear
scale (upper plots) and on a logarithmic scale (lower plots). The plots on the left
show the comparison between NNPDF2.3, CT10 and MSTW08, while in the plots on the right
we compare NNPDF2.3, HERAPDF1.5 and ABM11. All PDFs are shown for a common value of
α_s = 0.118.

2.2 Parton luminosities 

Now we compare parton luminosities. At a hadron collider, all factorizable observables
for the production of a final state with mass M_X depend on parton distributions
through a parton luminosity, which, following Ref. [25], we define as

\Phi_{ij}(M_X^2) = \frac{1}{s} \int_\tau^1 \frac{dx_1}{x_1} f_i(x_1, M_X^2) f_j(\tau/x_1, M_X^2) ,    (2)

where f_i(x, M²) is a PDF at a scale M², and τ ≡ M_X²/s. As for the PDFs, all parton
luminosities will be compared for a common value of the strong coupling α_s = 0.118.
The parton luminosities are displayed as ratios to the NNPDF2.3 set. We assume a
center-of-mass energy of 8 TeV.

Figure 2: Same as Fig. 1, but for the gluon PDF.
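The luminosity integral of Eq. (2) can be evaluated numerically along the following
lines. This is only a sketch: the function name, the midpoint rule in ln(x_1) and the
toy PDF shape are assumptions made for illustration, and a real comparison would
substitute interpolated PDFs from the sets discussed here.

```python
import math


def parton_luminosity(f_i, f_j, m_x, sqrt_s, n_steps=2000):
    """Evaluate (1/s) * int_tau^1 dx1/x1 f_i(x1, M^2) f_j(tau/x1, M^2).

    f_i, f_j: callables f(x, scale2) returning the PDF (not x times the PDF).
    m_x, sqrt_s: final-state mass and collider energy in the same units.
    The integral is done with the midpoint rule in u = ln(x1), since
    dx1/x1 = du.
    """
    s = sqrt_s ** 2
    scale2 = m_x ** 2
    tau = scale2 / s
    if not 0.0 < tau < 1.0:
        raise ValueError("need 0 < M_X < sqrt(s)")
    u_min = math.log(tau)          # lower limit x1 = tau
    step = -u_min / n_steps        # upper limit is u = 0, i.e. x1 = 1
    total = 0.0
    for k in range(n_steps):
        x1 = math.exp(u_min + (k + 0.5) * step)
        total += f_i(x1, scale2) * f_j(tau / x1, scale2)
    return total * step / s


def toy_pdf(x, scale2):
    """Hypothetical scale-independent shape, for illustration only."""
    return x ** -1.2 * (1.0 - x) ** 5
```

With constant PDFs equal to one the integral reduces to -ln(τ)/s, which provides a
simple check of the implementation. Ratios of such luminosities between PDF sets, as
shown in the figures, are independent of the overall 1/s factor.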

The gluon-gluon and quark-gluon luminosities are shown in Fig. 6, and the quark-quark
and quark-antiquark luminosities are shown in Fig. 7. There is reasonably good
agreement between the NNPDF2.3, MSTW08 and CT10 PDF sets for the full range of
invariant masses. However, the PDF uncertainties increase dramatically at M_X > 1 TeV,
relevant for searches and characterization of heavy particles. Future data from the
LHC on high-E_T jet production and the high-mass Drell-Yan process should be able to
provide constraints in this region. Differences with other PDFs are more pronounced
for the ABM11 and HERAPDF1.5 PDF sets. For HERAPDF1.5, there is generally agreement in
central values, but the uncertainty is rather larger in some x ranges, particularly
for the gluon luminosity, but also to some extent for the quark-antiquark one. For
ABM11 instead, the quark-quark and quark-antiquark luminosities are systematically
higher by over 5% below 1 TeV, and above this the quark-antiquark luminosity becomes
much softer than either NNPDF2.3 or MSTW08. The gluon-gluon luminosity becomes smaller
than all the other PDFs at high invariant masses, overlapping only with the very large
HERAPDF1.5 uncertainty.

Figure 3: The total strange PDFs xs⁺ = x(s + s̄) at Q² = 25 GeV². The plot on the left
shows the comparison between NNPDF2.3, CT10 and MSTW08, while in the plot on the right
we compare NNPDF2.3 and ABM11; HERAPDF1.5 is not included as it does not have an
independent parametrization of strangeness.

It is also useful to compare the relative PDF uncertainties in the parton luminosities.
In Fig. 8 we show this relative PDF uncertainty for the quark-antiquark and gluon-gluon
luminosities. Here we see clearly the much larger HERAPDF1.5 uncertainty. At high
invariant mass, the uncertainty in the ABM11 gluon-gluon luminosity becomes smaller,
despite the fact that this is an extrapolation region due to the scarcity of
experimental data.

The larger quark-antiquark luminosity from ABM11 as compared to the other PDF
sets could be inferred from the PDF comparison plots at lower Q²: the ABM gluon is a
little larger than the central value of the other groups below about x = 0.05, and this
drives more quark and antiquark evolution at small x values. It has been recently
suggested [26], based on results of a NLO fit to DIS data only, that some of these
features could be at least in part the consequence of the ABM treatment of heavy quark
contributions (see also [27]). Indeed, while CT, MSTW and NNPDF use a variable flavour
number scheme [8,14,28], ABM11 uses a fixed flavour number scheme for heavy-quark PDFs.
This may explain the increase in the medium-x and small-x light quarks and gluons, and
the corresponding softer large-x gluon required by the momentum sum rule, found in the
ABM fits [26], though more studies would be required in order to conclusively establish
this.

As an alternative explanation, a higher twist contribution has been invoked to explain
part of the differences between ABM11 and the other PDF groups. While ABM fit a higher
twist contribution, all groups minimize the impact of higher twists by suitable
kinematic cuts in Q² and W² = Q²(1/x - 1). The HERAPDF fit includes no data at low W²,
so that no cut is required. In addition, NNPDF2.3 includes kinematical target mass
corrections exactly [29]; these are known to be a substantial part of the higher twist
corrections.

The kinematical cuts Q²_min and W²_min applied to the fitted DIS data sets are
summarized for each group in Table 2 (the value of the scale Q_0² where the PDFs are
parametrized is also shown for completeness). It should be observed that the ABM11 fit
also imposes an upper cut Q²_max = 10³ GeV² on the HERA data. Stability under variation
of the default MSTW08 kinematical cuts was studied in Ref. [30]. The inclusion of
higher twists in MRST fits has previously been shown to lead to only a small effect on
high-Q² PDFs [31,32], and an ongoing extension of the study in [26] suggests this is
qualitatively the same with more up-to-date PDFs. This conclusion has been confirmed
in similar studies by NNPDF.

Figure 4: Same as Fig. 1 for the non-singlet triplet xT_3(x) and the total valence
xV(x) PDFs defined in Eq. (1).
Figure 5: Same as Fig. 4 for the sea asymmetry xΔ_S = x(d̄ - ū) and the strange
asymmetry xs⁻ = x(s - s̄). In the latter case we show only the results for MSTW08 and
NNPDF2.3, the only PDF sets that introduce an independent parametrization of the
strangeness asymmetry.





              Q_0² [GeV²]   Q²_min [GeV²]   W²_min [GeV²]
ABM11         9             2.5             3.24
CT10          1.69          4.0             12.25
HERAPDF1.5    1.9           3.5             -
MSTW08        1             2.0             15.0
NNPDF2.3      2.0           3.0             12.5

Table 2: Kinematical cuts in Q² and W² = Q²(1/x - 1) applied to DIS data in various PDF
determinations. The scale Q_0² at which PDFs are parametrized is also shown. For ABM11
there is also a maximum Q² < 1000 GeV² cut.
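The effect of the cuts in Table 2 on a given DIS data point can be illustrated with a
short sketch. The dictionary and function names are hypothetical, and it is an
assumption of this sketch that a point lying exactly on a threshold is kept; the ABM11
upper cut on the HERA data is not modelled.

```python
# (Q2_min, W2_min) in GeV^2, transcribed from Table 2; None means no W^2 cut.
DIS_CUTS = {
    "ABM11": (2.5, 3.24),
    "CT10": (4.0, 12.25),
    "HERAPDF1.5": (3.5, None),
    "MSTW08": (2.0, 15.0),
    "NNPDF2.3": (3.0, 12.5),
}


def w2(x, q2):
    """W^2 = Q^2 (1/x - 1), the definition used in the text."""
    return q2 * (1.0 / x - 1.0)


def passes_cuts(group, x, q2):
    """Return True if a DIS point (x, Q^2 in GeV^2) survives the group's cuts.

    Assumption: points at or above each threshold are retained.
    """
    q2_min, w2_min = DIS_CUTS[group]
    if q2 < q2_min:
        return False
    if w2_min is not None and w2(x, q2) < w2_min:
        return False
    return True
```

As an example, a point at x = 0.5 and Q² = 10 GeV² has W² = 10 GeV²: it would be
retained by the ABM11 and HERAPDF1.5 selections but removed by the W² cuts of CT10,
MSTW08 and NNPDF2.3. This is the high-x, low-W² region where higher twist effects are
expected to matter most.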
Figure 6: The gluon-gluon (upper plots) and quark-gluon (lower plots) luminosities,
Eq. (2), for the production of a final state of invariant mass M_X (in GeV) at LHC
8 TeV, shown as ratios to NNPDF2.3 NNLO. The left plots show the comparison between
NNPDF2.3, CT10 and MSTW08, while in the right plots we compare NNPDF2.3, HERAPDF1.5 and
ABM11. All luminosities are computed at a common value of α_s = 0.118.



[Figure 8 panels, each titled "LHC 8 TeV - Relative PDF uncertainty - α_s = 0.118" and
plotted versus M_X: one pair compares NNPDF2.3, CT10 and MSTW2008 NNLO; the other
compares NNPDF2.3, ABM11 and HERAPDF1.5 NNLO.]
Figure 8: The relative PDF uncertainties in the quark-antiquark luminosity (upper
plots) and in the gluon-gluon luminosity (lower plots), for the production of a final
state of invariant mass M_X (in GeV) at the LHC 8 TeV. All luminosities are computed at
a common value of α_s = 0.118.
3 LHC inclusive cross sections

In this section we compute inclusive cross sections at 8 TeV for various benchmark
processes and compare the results for all NNLO PDF sets. We consider electroweak gauge
boson production, top quark pair production and Higgs boson production in various
channels. We will provide results for α_s = 0.117 and α_s = 0.119. The Higgs case is
discussed in more detail in Sect. 6, together with the interplay between the PDF and
α_s uncertainties. For these inclusive benchmark cross sections, we use the following
codes and settings:

• Higgs boson production cross sections in the gluon fusion channel have been
computed at NNLO with the iHixs code [33]. The central scale has been taken to be
Q = m_H, to be consistent with the Higgs cross section working group
recommendations [34]. In all the Higgs production cross sections, we take
m_H = 125 GeV.

• Higgs production in the Vector Boson Fusion (VBF) channel has been computed at
NNLO with the VBF@NNLO code [35].

• Higgs production in association with W and Z bosons has been computed at NNLO
with the VH@NNLO program [36,37].

• Higgs production in association with a top quark pair, ttH, has been computed at
LO with the MCFM program [38].

• Electroweak gauge boson production has been computed at NNLO using the Vrap
code [39]. The central scale is Q² = M_V².

• Top quark pair production has been computed at NNLO_approx+NNLL with the
top++ code [40], including the latest development of the calculation of the complete
NNLO corrections to q q̄ → t t̄ production, documented in [41], as implemented
in v1.3. The central scale is Q² = m_t². The settings of the theoretical calculations
are the default ones in Ref. [42]. In all calculations we use m_t = 173.2 GeV.

We begin with the Higgs production cross sections. Results at 8 TeV for all relevant
production channels and different PDF sets and α_s(M_Z) values have been collected in
Table 3. In all cases the same value of α_s is used consistently in both the PDFs and
in the matrix element calculation. Results are also represented graphically in Fig. 9.
Note that the error bands shown correspond to the PDF uncertainty only, with the
exception of ABM11 and, to a lesser extent, HERAPDF, as already mentioned.

Gluon fusion (pb)
α_s(M_Z)   NNPDF2.3        MSTW08          CT10            ABM11           HERAPDF1.5
0.117      18.90 ± 0.20    18.45 ± 0.24    18.05 ± 0.36    18.11 ± 0.41    18.34 ± 1.03
0.119      19.54 ± 0.25    19.12 ± 0.25    18.73 ± 0.37    18.71 ± 0.42    18.94 ± 1.07

Vector boson fusion (pb)
α_s(M_Z)   NNPDF2.3        MSTW08          CT10            ABM11           HERAPDF1.5
0.117      1.635 ± 0.020   1.655 ± 0.029   1.681 ± 0.030   1.728 ± 0.020   1.668 ± 0.051
0.119      1.644 ± 0.020   1.658 ± 0.029   1.686 ± 0.030   1.731 ± 0.020   1.673 ± 0.051

WH production (pb)
α_s(M_Z)   NNPDF2.3        MSTW08          CT10            ABM11           HERAPDF1.5
0.117      0.739 ± 0.010   0.746 ± 0.011   0.738 ± 0.016   0.784 ± 0.010   0.751 ± 0.023
0.119      0.747 ± 0.010   0.752 ± 0.011   0.745 ± 0.016   0.789 ± 0.010   0.754 ± 0.023

ttH associated production (fb)
α_s(M_Z)   NNPDF2.3        MSTW08          CT10            ABM11           HERAPDF1.5
0.117      72.8 ± 2.1      74.6 ± 1.6      71.6 ± 3.4      66.6 ± 2.0      76.2 ± 9.0
0.119      75.1 ± 2.0      77.3 ± 1.6      76.1 ± 3.4      69.4 ± 2.0      79.4 ± 9.0

Table 3: The cross sections for Higgs production at 8 TeV in various channels using the
settings described in the text. From top to bottom: gluon fusion, vector boson fusion,
WH production and ttH production. We have assumed a Standard Model Higgs boson with
mass m_H = 125 GeV. We show the results for two different values of α_s(M_Z), 0.117 and
0.119.

t t̄ production (pb)
α_s(M_Z)   NNPDF2.3        MSTW08          CT10            ABM11           HERAPDF1.5
0.117      217.9 ± 4.8     222.5 ± 5.5     218.0 ± 7.8     199.7 ± 5.5     225.1 ± 26.1
0.119      227.8 ± 5.0     232.1 ± 5.8     227.6 ± 8.2     211.2 ± 5.8     237.5 ± 27.5

Table 4: Same as Table 3 for the cross sections for top quark pair production at 8 TeV
at NNLO_approx+NNLL, using top++ with the settings described in the text. We have
assumed a top quark mass of m_t = 173.2 GeV.

The main features which emerge from the plots are the following:

• The relative sizes of the cross sections obtained using different PDF sets are almost
independent of α_s: when α_s is varied all cross sections get rescaled by a comparable
amount.

• The ABM11 and HERAPDF1.5 central predictions for gluon fusion are contained
within the envelope of the NNPDF2.3, CT10 and MSTW results. However, the
HERAPDF1.5 uncertainty is bigger than this envelope. The agreement with ABM11
would be spoiled if their default value of α_s(M_Z) = 0.1134 were used.

• For VBF, WH and ttH production, there is reasonable agreement between CT10,
MSTW and NNPDF2.3 both in central values and in the size of PDF uncertainties.
ABM11 instead leads to rather different results, even when a common value of α_s
is used. For quark-initiated processes, like VBF and WH, the ABM11 cross section
is higher than that of the other sets, especially for WH production. For ttH, which
receives the largest contribution from gluon-initiated diagrams, the ABM11 cross
section is smaller.

• The HERAPDF1.5 PDF uncertainties are distinctly larger, especially for ggH and
ttH, mostly due to the fact that HERA data do not constrain the large-x gluon well.

A more detailed discussion of the interplay of PDF and α_s uncertainties for Higgs
production, focused on the gluon fusion channel, will be presented in Sect. 6 below.
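The first observation in the list above, that the relative sizes of the cross sections
are almost independent of α_s, can be checked directly from the gluon fusion central
values of Table 3. The sketch below uses hypothetical variable and function names; only
the numbers are taken from the table.

```python
# Gluon fusion central values in pb from Table 3, keyed by alpha_s(M_Z).
GLUON_FUSION_PB = {
    0.117: {"NNPDF2.3": 18.90, "MSTW08": 18.45, "CT10": 18.05,
            "ABM11": 18.11, "HERAPDF1.5": 18.34},
    0.119: {"NNPDF2.3": 19.54, "MSTW08": 19.12, "CT10": 18.73,
            "ABM11": 18.71, "HERAPDF1.5": 18.94},
}


def ratios_to_reference(cross_sections, reference="NNPDF2.3"):
    """Ratio of each central value to that of the reference PDF set."""
    ref = cross_sections[reference]
    return {name: value / ref for name, value in cross_sections.items()}


RATIOS = {alpha_s: ratios_to_reference(values)
          for alpha_s, values in GLUON_FUSION_PB.items()}

# Change of each ratio when alpha_s(M_Z) goes from 0.117 to 0.119.
RATIO_SHIFTS = {name: RATIOS[0.119][name] - RATIOS[0.117][name]
                for name in RATIOS[0.117]}
```

With these inputs every ratio to NNPDF2.3 changes by less than 0.005 between the two
α_s values, while the cross sections themselves grow by roughly 3 to 4%, so the ordering
and spacing of the PDF sets is essentially preserved.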

σ(W⁺) (nb)
α_s(M_Z)   NNPDF2.3         MSTW08           CT10             ABM11            HERAPDF1.5
0.117      6.937 ± 0.097    6.967 ± 0.118    6.990 ± 0.150    7.419 ± 0.107    7.088 ± 0.189
0.119      7.045 ± 0.094    7.072 ± 0.118    7.107 ± 0.151    7.509 ± 0.105    7.140 ± 0.191

σ(W⁻) (nb)
α_s(M_Z)   NNPDF2.3         MSTW08           CT10             ABM11            HERAPDF1.5
0.117      4.855 ± 0.058    4.945 ± 0.083    4.857 ± 0.111    5.073 ± 0.079    4.987 ± 0.117
0.119      4.906 ± 0.061    5.004 ± 0.083    4.940 ± 0.112    5.136 ± 0.078    5.027 ± 0.118

σ(Z) (nb)
α_s(M_Z)   NNPDF2.3         MSTW08           CT10             ABM11            HERAPDF1.5
0.117      1.120 ± 0.013    1.128 ± 0.019    1.126 ± 0.024    1.179 ± 0.016    1.135 ± 0.033
0.119      1.127 ± 0.013    1.141 ± 0.019    1.144 ± 0.019    1.192 ± 0.017    1.145 ± 0.033

σ(W⁺)/σ(W⁻)
α_s(M_Z)   NNPDF2.3         MSTW08           CT10             ABM11            HERAPDF1.5
0.117      1.429 ± 0.013    1.409 ± 0.011    1.439 ± 0.013    1.462 ± 0.015    1.421 ± 0.013
0.119      1.436 ± 0.012    1.413 ± 0.011    1.439 ± 0.013    1.462 ± 0.015    1.420 ± 0.013

σ(W)/σ(Z)
α_s(M_Z)   NNPDF2.3         MSTW08           CT10             ABM11            HERAPDF1.5
0.117      10.523 ± 0.035   10.560 ± 0.018   10.521 ± 0.068   10.595 ± 0.024   10.639 ± 0.057
0.119      10.604 ± 0.035   10.583 ± 0.018   10.532 ± 0.068   10.608 ± 0.024   10.626 ± 0.057

Table 5: The inclusive cross sections for electroweak gauge boson production at 8 TeV
at NNLO using the Vrap code, obtained using various NNLO PDF sets, for different values
of α_s. From top to bottom we show the results for the W⁺, W⁻ and Z total cross sections
and then for the W⁺/W⁻ and W/Z cross section ratios. We include also the recent CMS
measurements.

Next we consider inclusive top quark pair production. Theoretical progress towards
the full NNLO result has been made recently [41-46], including the recent calculation
of the full NNLO qg initiated contribution [53] (which amounts to a small O(1%)
correction, contrary to previous approximate estimates [E]). The approximate NNLO top
quark pair production cross sections at 8 TeV for different PDF sets and for different
values of α_s(M_Z) have been collected in Table 4. In all cases the same value of α_s
is used consistently in the PDFs and in the matrix element calculation. Results are
also shown in Fig. 10 and compared to the recent CMS measurements [47].³ The variation
in the cross sections with α_s shows that the t t̄ total cross section has some
sensitivity to the value of α_s. This sensitivity has been recently used by CMS to
provide the first ever determination of α_s from top cross sections [48]. For the t t̄
cross section, we see reasonable agreement between NNPDF2.3, CT10 and MSTW, while
ABM11 is somewhat lower. Using the default value of α_s = 0.1134 in ABM11 would make
the difference even more marked. The HERAPDF1.5 central value is in good agreement with
the global fits but, as usual, the PDF uncertainties are larger.
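The sensitivity of the top quark pair cross section to α_s mentioned above can be
quantified from the central values of Table 4 with a simple finite-difference estimate
of the logarithmic derivative d ln σ / d ln α_s. This is only an illustration: the names
are hypothetical, and the estimate mixes the α_s dependence of the matrix element with
that of the PDFs.

```python
ALPHA_S_LOW, ALPHA_S_HIGH = 0.117, 0.119

# Central values in pb from Table 4: (sigma at 0.117, sigma at 0.119).
TTBAR_PB = {
    "NNPDF2.3": (217.9, 227.8),
    "MSTW08": (222.5, 232.1),
    "CT10": (218.0, 227.6),
    "ABM11": (199.7, 211.2),
    "HERAPDF1.5": (225.1, 237.5),
}


def alpha_s_exponent(sigma_low, sigma_high):
    """Finite-difference estimate of d ln(sigma) / d ln(alpha_s)."""
    d_ln_sigma = (sigma_high - sigma_low) / (0.5 * (sigma_high + sigma_low))
    d_ln_alpha = (ALPHA_S_HIGH - ALPHA_S_LOW) / (0.5 * (ALPHA_S_HIGH + ALPHA_S_LOW))
    return d_ln_sigma / d_ln_alpha


EXPONENTS = {name: alpha_s_exponent(low, high)
             for name, (low, high) in TTBAR_PB.items()}
```

The effective exponents obtained in this way lie between roughly 2.5 and 3.3 for the
five sets, that is, a 1% change in α_s(M_Z) shifts the cross section by about 2.5 to
3.3%. This is comparable in size to the PDF uncertainties quoted in Table 4 for the
global fits.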

Finally, we discuss the inclusive electroweak gauge boson production at 8 TeV. Here
we can also compare with the recent CMS measurements [49]. The cross section results
for α_s = 0.117 and 0.119 are collected in Table 5, where from top to bottom we show
the results for the W⁺, W⁻ and Z total cross sections and then for the W⁺/W⁻ and
W/Z cross section ratios. Results are shown graphically and compared to the recent
CMS data in Fig. 11. In the figure we show results only for α_s(M_Z) = 0.118, since the
strong coupling dependence of these cross sections is rather mild, particularly for the
cross section ratios.

We find good agreement between MSTW, CT10, NNPDF2.3 and HERAPDF1.5: this is to be
expected, since from Fig. 7 we know that the respective q q̄ parton luminosities are
similar in the relevant regions. On the other hand, ABM11 leads to systematically
higher cross sections (particularly for the u-quark dominated cross sections),
consistent with the larger luminosities seen in Fig. 7. The available LHC 8 TeV data
are in good agreement with the theory predictions, perhaps disfavoring the harder ABM11
cross sections, although the accuracy is not enough for full discrimination. Future
data for lepton differential distributions at 8 TeV will be an important ingredient for
the next generation of PDF determinations.

3 We take the average of the cross section in the di-lepton and lepton+jets final
states.
Figure 9: Comparison of the predictions for the LHC Standard Model Higgs boson cross
sections at 8 TeV obtained using various NNLO PDF sets. From top to bottom we show
gluon fusion, vector boson fusion, associated production (with W), and associated
production with a t t̄ pair. The left hand plots show results for α_s(M_Z) = 0.117,
while on the right we have α_s(M_Z) = 0.119.

Figure 10: Comparison of the predictions for top quark pair production at LHC 8 TeV
obtained using various NNLO PDF sets. Left plot: results for α_s(M_Z) = 0.117. Right
plot: results for α_s(M_Z) = 0.119. In both cases we also show the recent CMS 8 TeV
measurement.

Figure 11: Comparison of the predictions for inclusive cross sections for electroweak
gauge boson production between different PDF sets at LHC 8 TeV. In all cases the
branching ratios to leptons have been included. From top to bottom and from left to
right we show the W⁺, W⁻ and Z inclusive cross sections, and then the W⁺/W⁻ and W/Z
ratios. All cross sections are compared at a common value of α_s(M_Z) = 0.118. We also
show the recent CMS 8 TeV measurements.
4 PDF dependence of LHC differential distributions

We now study the PDF dependence of LHC differential distributions. Since we want to
quantify the agreement between data and theory we consider only the LHC data sets for
which the full experimental covariance matrix is available. These were all taken at
7 TeV centre of mass energy: the 8 TeV data on differential distributions have yet to
be released. We will provide a comparison of theory and data for electroweak vector
boson and inclusive jet production, and examine whether these data can discriminate
between the PDFs.⁴ In the next section we will present a more detailed study of jet
production, including comparison between different codes, a discussion of scale
dependence, and a study of systematic shifts for each PDF set in the description of
ATLAS data.
Specifically, the experimental data that we consider in this section are:

• The ATLAS measurement of the W lepton and Z rapidity distributions from the
2010 dataset (36 pb$^{-1}$) [52].

• The CMS measurement of the electron asymmetry with the 2011 dataset (840 pb$^{-1}$) [53].

• The LHCb measurements of the $W^+$ and $W^-$ lepton-level rapidity distributions in
the forward region from the 2010 dataset [54].

• The ATLAS measurement of inclusive jet production from the 2010 dataset
(36 pb$^{-1}$) [55]. We consider the R = 0.4 dataset only; very similar results are
obtained if the R = 0.6 radius is also used.$^5$

Theoretical predictions have been obtained as follows: 

• For electroweak vector boson production, we have computed differential distributions
at NLO with the MCFM code [58] interfaced to the APPLgrid software [59], which allows a
fast computation of the observable when PDFs are varied, and cross-checked against
the DYNNLO code [60]. For the ATLAS W, Z data we have also cross-checked against
the APPLgrid implementation used in the ATLAS strangeness determination [24].
NNLO predictions have been obtained using local K-factors determined with DYNNLO.

• For inclusive jet production, we have used the NLOjet++ program interfaced to the
APPLgrid software. The scale is chosen to be the $p_T$ of the hardest jet in the event
within each rapidity bin. Comparisons with FastNLO [61] and MEKS [62] are presented
in the next section. Note that, even though NNLO PDFs are used, the accuracy of
the calculation is NLO, as NNLO partonic cross sections are not yet available.
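
As a minimal sketch of how bin-by-bin ("local") K-factors can be applied, the following Python fragment rescales a fast NLO prediction by the NNLO/NLO ratio in each bin. The function name and all numbers are hypothetical and chosen only for illustration. Taking the K-factor from a single reference PDF set is an assumption of this sketch, not a statement about the settings used in the computation described above.

```python
import numpy as np

def apply_local_k_factors(nlo_fast, nlo_ref, nnlo_ref):
    """Rescale a fast NLO prediction bin by bin with K = NNLO / NLO.

    nlo_fast : NLO prediction per bin (e.g. from a fast interpolation grid)
    nlo_ref  : reference NLO cross section per bin
    nnlo_ref : reference NNLO cross section per bin
    """
    nlo_fast = np.asarray(nlo_fast, dtype=float)
    k = np.asarray(nnlo_ref, dtype=float) / np.asarray(nlo_ref, dtype=float)
    return k * nlo_fast

# Toy inputs (hypothetical, arbitrary units) for three rapidity bins.
nlo_ref = [100.0, 80.0, 50.0]
nnlo_ref = [103.0, 81.6, 50.5]
nlo_other_pdf = [98.0, 79.0, 51.0]
print(apply_local_k_factors(nlo_other_pdf, nlo_ref, nnlo_ref))
```

Because the K-factor is a ratio evaluated separately in each bin, it can change the shape of the distribution as well as its normalization.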

In order to provide quantitative comparisons we compute the $\chi^2$ using different PDF
sets. Note that, unlike other sets, NNPDF2.3 already includes these data in its fit, so
it necessarily provides a good description of all of them. For consistency of comparison,
we use the same definition, Eq. (7), of the $\chi^2$ with the experimental covariance matrix
of Eq. (8), even though this is not in general the quantity which has been minimized when
determining PDFs. Results at NLO and at NNLO are summarized in Tables 6 and 7,
where common values of $\alpha_s(M_Z) = 0.117$ and $\alpha_s(M_Z) = 0.119$, respectively, have been
used.

4 In addition to these sets, ATLAS data on differential top quark pair production have been recently
presented [50]. They include the experimental covariance matrix, hence they could be included in global
PDF fits to constrain the gluon PDF. We do not consider inclusive photon production, since the covariance
matrix is not available. The impact of the photon data on the PDF analysis was studied in Ref. [51].

5 Recently, the ratios of these jet cross sections to the 2.76 TeV ones were also presented [56], although
in preliminary form. These cross section ratios [57] have the potential to improve the PDF constraints as
compared to the 7 TeV data alone, thanks to the cancellation of systematic uncertainties.

NLO, $\alpha_s = 0.117$

| Dataset | NNPDF2.3 | MSTW08 | CT10 | ABM11 | HERAPDF1.5 |
|---|---|---|---|---|---|
| ATLAS W, Z | 1.234 | 1.993 | 1.047 | 1.472 | 1.719 |
| CMS W el asy | 0.884 | 4.694 | 1.458 | 1.961 | 0.671 |
| LHCb W | 0.658 | 0.869 | 0.994 | 2.272 | 2.885 |
| ATLAS jets | 0.916 | 0.893 | 1.212 | 1.409 | 0.968 |

NNLO, $\alpha_s = 0.117$

| Dataset | NNPDF2.3 | MSTW08 | CT10 | ABM11 | HERAPDF1.5 |
|---|---|---|---|---|---|
| ATLAS W, Z | 1.382 | 3.194 | 1.125 | 1.923 | 1.845 |
| CMS W el asy | 0.828 | 4.140 | 1.778 | 1.602 | 0.817 |
| LHCb W | 0.741 | 0.956 | 0.892 | 1.873 | 0.744 |
| ATLAS jets | 0.862 | 0.828 | 0.940 | 0.963 | 0.848 |

Table 6: The $\chi^2/N_{\rm pt}$ values for the available LHC data with published correlated uncertainties,
computed using different PDF sets. The theoretical predictions have been computed at NLO
(upper table) and at NNLO (lower table) using APPLgrid for a common value of the strong coupling
$\alpha_s(M_Z) = 0.117$. The experimental definition of the covariance matrix $({\rm cov})_{ij}$ is used, see Eq. (7).
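
The $\chi^2$ per data point used in these comparisons can be illustrated with a short Python sketch. It assumes the generic quadratic form $\chi^2 = \sum_{ij} (D_i - T_i)\,({\rm cov}^{-1})_{ij}\,(D_j - T_j)$ for data $D_i$ and theory $T_i$. The function name, the toy numbers, and the construction of the covariance matrix from one fully correlated systematic source are hypothetical, and are not the experimental inputs used for Tables 6 and 7.

```python
import numpy as np

def chi2_per_point(data, theory, cov):
    """Return chi^2 / N_pt = (D - T)^T cov^{-1} (D - T) / N_pt."""
    data = np.asarray(data, dtype=float)
    theory = np.asarray(theory, dtype=float)
    cov = np.asarray(cov, dtype=float)
    r = data - theory
    # Solve cov * x = r rather than forming the inverse explicitly.
    x = np.linalg.solve(cov, r)
    return float(r @ x) / r.size

# Toy example (hypothetical numbers): three bins, uncorrelated errors
# plus a single fully correlated systematic source.
data = np.array([1.00, 2.00, 3.00])
theory = np.array([1.10, 1.90, 3.05])
stat = np.array([0.10, 0.10, 0.10])
sys = np.array([0.05, 0.10, 0.15])
cov = np.diag(stat**2) + np.outer(sys, sys)
print(chi2_per_point(data, theory, cov))
```

For a diagonal covariance matrix the expression reduces to the familiar sum of squared pulls divided by the number of points.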

The main conclusions which can be drawn from these comparisons are the following: 

• All PDF sets lead to predictions in reasonable agreement with ATLAS jet data. In
general, the description improves when NNLO PDFs are used as compared to NLO
PDFs. While the ATLAS jet data appear to have only moderate constraining power,
a larger impact is expected when the full 7 TeV 5 fb$^{-1}$ data from CMS and ATLAS
become available.

• The ATLAS and CMS electroweak data appear to have considerable discriminating
power, and thus are likely to significantly constrain quarks and antiquarks at
medium and small x, and specifically strangeness [24]. The worst description of the
electroweak data is provided by MSTW08: this will be discussed in more detail
below.

• The LHCb data also appear to have discriminating power. These data are sensitive
to flavor separation at the smallest values of x, and to fairly high-x quarks, thanks
to the forward coverage of the LHCb detector. Predictions obtained using all PDF
sets describe the data quite well, with the exception of ABM11. It should be noted
that while at NNLO HERAPDF1.5 agrees with the data, at NLO it provides
a poor description, due to the large antiquark PDF at high x.

The main reason why MSTW08 provides a rather poor description of the ATLAS W, Z,
and especially of the CMS W data is understood [17,18] as a consequence of the behavior
of the $u_v - d_v$ distribution around $x \sim 0.03$. Indeed, in Ref. [17] it is shown that once
the LHC W asymmetry data are included in MSTW08 using PDF reweighting [63,64], the
fit quality improves substantially. In Ref. [18] it is shown that an extended parameterisation
for quarks (and to a lesser extent a consideration of deuteron corrections) automatically
alters the form of $u_v - d_v$ for the standard MSTW08 fit in the relevant region without
including new data, and the predictions for the asymmetry improve enormously: the
$\chi^2$ for the prediction for the asymmetry data decreases to about one per point. It is also
demonstrated explicitly that this is a very local discrepancy which has a very small effect
on more inclusive cross sections, much less than PDF uncertainties.

NLO, $\alpha_s = 0.119$

| Dataset | NNPDF2.3 | MSTW08 | CT10 | ABM11 | HERAPDF1.5 |
|---|---|---|---|---|---|
| ATLAS W, Z | 1.271 | 2.003 | 1.061 | 1.561 | 1.757 |
| CMS W el asy | 0.822 | 4.698 | 1.421 | 1.929 | 0.693 |
| LHCb W | 0.673 | 0.919 | 1.063 | 2.332 | 4.124 |
| ATLAS jets | 1.004 | 0.972 | 1.352 | 1.345 | 1.111 |

NNLO, $\alpha_s = 0.119$

| Dataset | NNPDF2.3 | MSTW08 | CT10 | ABM11 | HERAPDF1.5 |
|---|---|---|---|---|---|
| ATLAS W, Z | 1.435 | 3.201 | 1.160 | 2.061 | 1.872 |
| CMS W el asy | 0.813 | 3.862 | 1.772 | 1.614 | 0.814 |
| LHCb W | 0.831 | 1.050 | 0.966 | 1.970 | 0.784 |
| ATLAS jets | 0.937 | 0.935 | 1.016 | 0.959 | 1.011 |

Table 7: Same as Table 6 but for $\alpha_s(M_Z) = 0.119$.

We can also compare the agreement of the different PDF sets with the data by
examining plots, although of course this will be less quantitative than the $\chi^2$ comparison.
Note, in particular, that the correlated systematic uncertainty (shown as a band at the bottom of each
plot) is quite large, and typically dominates over the uncorrelated statistical uncertainty.
As a consequence, it is difficult to judge the fit quality by simple inspection of the plots.
As before, we show on the one hand a comparison of NNPDF2.3, CT10 and MSTW, and
on the other of NNPDF2.3, ABM11 and HERAPDF1.5. The comparison for the ATLAS
electroweak boson production data is shown in Fig. 12, for CMS and LHCb W production
in Fig. 13, and for ATLAS inclusive jet data in Fig. 14. We show only a subset of all the
possible comparisons, and only for $\alpha_s = 0.118$; a fuller set of plots can be found at the
HepForge link mentioned previously.

Figure 12: Comparison of the ATLAS electroweak vector boson production data with the
NNPDF2.3, CT10 and MSTW2008 predictions with $\alpha_s = 0.118$. The error bars correspond to
statistical uncertainties, while the band at the bottom of the plot indicates the correlated
systematics (including normalization errors).

Figure 13: Same as Fig. 12 for CMS and LHCb W production.

Figure 14: Comparison of the ATLAS R = 0.4 inclusive jet production data from the 2010 dataset
with the NNPDF2.3, CT10 and MSTW2008 NNLO PDF sets and $\alpha_s = 0.118$. The error bars
correspond to statistical uncertainties, while the band at the bottom of the plot indicates the
correlated systematics (including normalization errors).

5 ATLAS inclusive jet production at NLO



As outlined in the last section, jet production is one of the cornerstone processes of the 
physics program at the LHC. It has reached unprecedented statistical precision and can 
serve both for detailed tests of perturbative QCD and searches for hypothetical new in- 
teractions. Inclusive jet production measurements impose direct constraints on the gluon 
PDF, and the LHC data can in principle be sensitive to the gluon PDF in a very wide 
range of momentum fractions x [65]. Inclusive jet production at the Tevatron and LHC 
can be used to reduce the gluon uncertainty, and thus improve the predictions for impor- 
tant processes like Higgs production in gluon fusion. The last section gave a brief outline 
of the current comparison of the QCD predictions with various PDF sets to the current 
ATLAS data, but here we point out some more detailed features of the analysis, which 
will become more important as the precision of the data collected improves. 

There exist two independent computer programs for computing single-inclusive jet and
dijet production at NLO at the parton level, EKS [66] and NLOjet++ [67,68]. The EKS code
was written in the early 1990s and was used to tabulate point-by-point NLO/LO K factors
for jet production in previous CTEQ global fits. As the precision of the jet data increased,
it became necessary to develop a new version of EKS with enhanced numerical stability
and percent-level accuracy. It also became clear that the PDFs that are constrained by
the jet cross sections may depend on the theoretical assumptions made in the computation
of NLO theoretical cross sections. To address this issue, a deeply revised version of the
EKS code, designated as MEKS [62], was recently released and compared against the other
independent code, NLOjet++ [67,68]. This study documented specific settings in the two
codes that bring them into agreement to within 1-2% at both the Tevatron and LHC.

The MEKS and NLOjet++ calculations are relatively slow and require significant CPU
time to reach acceptable accuracy, so that their direct use in PDF fits is impractical.
Instead, the global PDF analyses reproduce the NLO cross sections by fast numerical
approximations. Besides the interpolation of the tabulated NLO/LO K factors that was
utilized until recently by CTEQ, a more flexible approach is provided by the programs
FastNLO [61,69,70] and APPLgrid [59]. They quickly and accurately interpolate the tables
of NLO jet cross sections initially computed in NLOjet++. The threshold corrections to
inclusive jet production of $O(\alpha_s^2)$ [71] are also available as an estimate of the unknown
NNLO terms.$^6$

Besides fixed-order QCD calculations, NLO event generators such as POWHEG [72]
and SHERPA [73] combine the NLO hard cross section for inclusive jet production with
leading-log showering evaluated by HERWIG or PYTHIA. POWHEG predictions for ATLAS
jet production differ from the fixed-order predictions [55] and also show quite
a strong dependence on the parton showering, even at the highest $p_T$, while the SHERPA
results are in general closer to NLO. Only fixed-order calculations will be considered in
the rest of this section. Electroweak corrections to dijet production have also been studied
in Refs. [74,75].

In their most recent PDF sets, FastNLO is used by the CTEQ and MSTW groups, while
APPLgrid is used by NNPDF.$^7$ Predictions from either program depend significantly on the
choices for the QCD renormalization and factorization scales ($\mu_R$ and $\mu_F$), recombination
scheme, and realization of the jet algorithm [62]. In the case of inclusive jet production,
the default hard scale specifying the $\mu_F$ and $\mu_R$ values in each event can be taken to be
equal to "$p_T$ of each individual jet" (FastNLO version 2), "$p_T$ of the hardest jet", "$p_T$ of
the hardest jet in each rapidity bin" (APPLgrid), or "average $p_T$ in each $p_T$ bin" (FastNLO
version 1). Differences between these choices are relevant in modern comparisons, as will
be shown below. Similar ambiguities are present in computations for dijet production.
We will explicitly distinguish between these various scale prescriptions to avoid a common
inaccuracy of referring to all of them as "the scales that are equal to jet $p_T$".

6 Threshold corrections are not included in this study.

7 Since NNPDF2.0 [75], the Tevatron jet data are included in NNPDF with the FastNLO package. In the
recent NNPDF2.3, the FastNLO grid tables are accessed using the APPLgrid wrapper.

5.1 Comparison of computer programs and scale dependence 

In this section, we compare predictions of APPLgrid, FastNLO, and MEKS for inclusive jet
production in ATLAS at 7 TeV [55]. In Fig. 15 the 2010 ATLAS data set (with the jet
cone size R = 0.4) is compared to NLO predictions from APPLgrid, FastNLO (version 2),
and MEKS, using the NNPDF2.3 NLO PDF set. The cross sections are plotted vs. jet
transverse momentum, $p_T$, in seven bins of the magnitude $|y|$ of jet rapidity. The error
bars show the experimental data with the statistical and uncorrelated systematic errors
added in quadrature: no correlated systematic shifts are included. In each $p_T$ bin, all cross
sections are normalized to the corresponding prediction from FastNLO. The cross sections
of the central FastNLO prediction are computed using the $p_T$ of each individual jet as the
renormalization and factorization scale, $\mu_R = \mu_F = p_T^{\rm ind}$. Hatched bands represent a scale
variation of the FastNLO predictions, obtained by varying $\mu_R$ and $\mu_F$ separately in the
intervals $p_T^{\rm ind}/2 < \mu_{R,F} < 2 p_T^{\rm ind}$. Three colored lines correspond to two predictions from
MEKS and a prediction from APPLgrid.

The MEKS cross sections are obtained with the scales equal to the individual jet $p_T^{\rm ind}$
(same as in FastNLO and denoted by MEKS1) or the hardest jet $p_T^{\rm hard}$ in each event (MEKS2).
In the MEKS1 convention, if the transverse momenta $\{p_T\} = \{p_T^{(1)}, p_T^{(2)}, p_T^{(3)}\}$ of the jets in a
three-jet event are ordered as $p_T^{(1)} > p_T^{(2)} > p_T^{(3)}$, the event contributes cross section weights
$w(\{p_T\}, \mu = p_T^{(1)})$, $w(\{p_T\}, \mu = p_T^{(2)})$, and $w(\{p_T\}, \mu = p_T^{(3)})$ to the $p_T$ bins around $p_T^{(1)}$,
$p_T^{(2)}$, and $p_T^{(3)}$, respectively. In the MEKS2 convention, the event contributes the same cross
section weight $w(\{p_T\}, \mu = p_T^{(1)})$ to all three bins.

The scale choice in APPLgrid sets $\mu_R$ and $\mu_F$ equal to the $p_T$ of the hardest jet in
each rapidity bin. It coincides with the MEKS1 convention if all jets fall into different
rapidity bins, but will select the larger of the two $p_T$ values as the scale if two jets are in
the same rapidity bin.
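
The three scale conventions can be made concrete with a small Python sketch that, for one event, lists which scale is attached to the weight filled into the $p_T$ bin of each jet. This is only an illustration of the bookkeeping described above, not code from any of the programs. The function name, the convention labels, and the rapidity bin edges are assumptions introduced here.

```python
import bisect

# Illustrative |y| bin edges (seven bins); assumed here, not taken from the text.
Y_EDGES = (0.0, 0.3, 0.8, 1.2, 2.1, 2.8, 3.6, 4.4)

def jet_scales(jets, convention, y_edges=Y_EDGES):
    """Return a list of (jet pT, scale mu) pairs for one event.

    jets       : list of (pT, |y|) tuples
    convention : "individual"              -> mu = pT of each jet
                 "hardest"                 -> mu = pT of the hardest jet in the event
                 "hardest_in_rapidity_bin" -> mu = pT of the hardest jet in the same |y| bin
    """
    if convention == "individual":
        return [(pt, pt) for pt, _ in jets]
    if convention == "hardest":
        mu = max(pt for pt, _ in jets)
        return [(pt, mu) for pt, _ in jets]
    if convention == "hardest_in_rapidity_bin":
        def ybin(y):
            return bisect.bisect_right(y_edges, y) - 1
        hardest = {}
        for pt, y in jets:
            b = ybin(y)
            hardest[b] = max(hardest.get(b, 0.0), pt)
        return [(pt, hardest[ybin(y)]) for pt, y in jets]
    raise ValueError(f"unknown convention: {convention}")

# Three-jet toy event: the two leading jets share the most central |y| bin.
jets = [(400.0, 0.1), (300.0, 0.2), (100.0, 1.5)]
for name in ("individual", "hardest", "hardest_in_rapidity_bin"):
    print(name, jet_scales(jets, name))
```

In this toy event the second jet is assigned the scale 400 GeV under both the hardest-jet and the rapidity-bin conventions, while the third jet keeps its own $p_T$ as the scale under the rapidity-bin convention because it is alone in its bin.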

In Fig. 15 we can see that, at the largest $p_T$ values, all four predictions agree to
within 1%. FastNLO and MEKS1 agree to about 1% even at low $p_T$, apart from minor
fluctuations caused by Monte Carlo integration errors. Their agreement is not surprising,
since FastNLO and MEKS1 follow the same scale choice.

At low $p_T$, the APPLgrid event rate shows a systematic deficit of up to 4% compared
to FastNLO, while the MEKS2 rate is even smaller in this region. This is the consequence of
using a QCD scale that is equal or close to the hardest jet $p_T$, which suppresses the cross
section compared to other scale choices. The MEKS2 curve lies, for the most part, within the
scale uncertainty band of the FastNLO prediction, with the exception of the $p_T < 200$ GeV
region. We conclude that the most up-to-date versions of the parton-level NLO programs
show very good agreement for the same scale choice. However, the scale dependence
of the NLO cross section is an important systematic uncertainty: its magnitude is of the
same order as the experimental correlated systematic errors. In Fig. 15, which shows
the experimental data without the correlated systematic errors, the difference between
the theoretical predictions and the unshifted central data values provides a crude
estimate of the size of the correlated systematic error. As seen in the last section, the
quality of the fit is very good, so the data and theory predictions can be brought into line
using shifts of data corresponding to the size of the correlated errors, or less. In fact, it
can be checked from the results in [55] that this is a reasonable approximation, especially
at the highest $p_T$ values and the highest rapidity bins, where the systematic uncertainty
is larger than the difference between data and theory in Fig. 15. The scale uncertainty, defined
as above, varies from about 15% of the theoretical prediction in the bins with small rapidity to 40%
at the largest $|y|$. Hence, the contribution of the scale uncertainty is significant compared
to the experimental systematic uncertainty, and reduces the sensitivity of LHC inclusive
(di)jet production to different PDF models, particularly at the highest rapidities.

5.2 PDF dependence 

As already seen in the previous section, all available PDF sets fit the current
ATLAS jet data well, which therefore do not provide much discrimination. However, there
are still interesting features to pick out which will become more important for future data.
Fig. 16 compares the corresponding NLO predictions made using APPLgrid and various
NNLO PDFs: ABM11, CT10, HERA1.5, MSTW08, and NNPDF2.3. We take $\alpha_s(M_Z) = 0.119$
both in the hard cross sections and PDFs for all PDF sets. All the predictions are
normalized to the central prediction based on the CT10 NNLO PDF set (with $\alpha_s = 0.119$).
For the NNPDF2.3 and CT10 sets, we show the 68% C.L. PDF uncertainties by the
hatched bands. The CT10 central predictions are larger than NNPDF2.3 or MSTW2008,
mainly due to the harder gluon distribution in the CT10 set. In general, predictions
from different PDFs agree with each other within the range of PDF uncertainties, apart
from ABM11, particularly at low rapidities. It is also instructive to compare the scale
uncertainties shown in Fig. 15 with the PDF uncertainties shown in Fig. 16. In the low $p_T$
region, i.e. below $p_T \sim 200$ GeV, the scale uncertainty of NLO predictions is comparable
to, or even larger than, the PDF uncertainties from CT10. This is another indication that
the scale uncertainty presents a limiting factor in the discrimination between the PDF
sets, especially for PDFs which are already well-constrained, in this case by HERA data.

5.3 Systematic shifts in a fit to the ATLAS jet data 

When the NLO theoretical predictions are compared to the ATLAS inclusive jet data
without including the systematic errors, as in Fig. 15, one generally finds very poor
agreement for any PDF set. In this case, the $\chi^2$ value can reach several thousand units
for a total of $N_{\rm pt} = 90$ data points. The agreement is improved dramatically after the
correlated systematic errors are considered. This can be done, e.g., by including a term
with a correlation matrix $\beta_{k\alpha}$ into the log-likelihood function $\chi^2$ as described in
the appendix. We will use the definition of $\chi^2$ provided by Eq. (11), which introduces a
normally distributed nuisance parameter $\lambda_\alpha$ (with the central value of zero and standard
deviation of one) to characterize each of $N_\lambda$ correlated errors.

| PDF set | $\chi^2/N_{\rm pt}$ | $\chi^2_D$ | $\chi^2_\lambda$ | $\lambda_{0,{\rm lum}}$ |
|---|---|---|---|---|
| ABM11 | 0.81 | 44.4 | 28.5 | -1.12 |
| CT10 | 0.81 | 47.4 | 25.5 | -1.76 |
| CT10 NLO | 0.94 | 54.0 | 30.6 | -1.18 |
| HERA1.5 | 0.85 | 50.7 | 25.8 | -2.36 |
| MSTW08 | 0.79 | 45.7 | 25.1 | -2.00 |
| NNPDF2.3 | 0.79 | 42.4 | 29.1 | -1.88 |

Table 8: $\chi^2/N_{\rm pt}$ values for the 2010 ATLAS single-inclusive jet data (R = 0.4) computed
according to Eq. (11) using FastNLO (version 2) and various NNLO PDF sets and the
CT10 NLO set. The $\chi^2_D$ and $\chi^2_\lambda$ contributions to $\chi^2$ from the data residuals and penalties
for systematic shifts, defined in Eqs. (12) and (13), are shown. The last column contains
the best-fit luminosity parameter shift $\lambda_{0,{\rm lum}}$ for each PDF set. We have used $\alpha_s = 0.119$
for all sets.

The ATLAS measurement provides 88 sources of correlated systematic errors, including
the luminosity error and the uncertainty in the nonperturbative correction. Each of these
errors can cause variations (shifts) of the experimental points from their central values.
In addition, each data point is affected by an uncorrelated systematic error, which is
significant compared to the statistical error. When both uncorrelated and correlated
systematic uncertainties are included in $\chi^2$, the resulting $\chi^2/N_{\rm pt}$ values are less than 1
for all considered NNLO PDFs, as shown in Table 8. In this comparison, $\chi^2$ is computed
according to the procedure summarized in Sec. A.2 and numerically equivalent to Eq. (8).
None of the PDF sets is preferred by these $\chi^2$ values. As one can see, the $\chi^2$ values are
extremely similar to those in the previous section, Tables 6 and 7, even though they are
computed with a different code (FastNLO).

For each set of theoretical predictions $\{T_k\}$, we can also determine the value $\lambda_{0\alpha}$ of each
nuisance parameter that gives the best description of the data. It is found according to
Eq. (14) once the $\{T_k\}$ values are known. In Eq. (11) for the total $\chi^2$, we can identify two
parts: $\chi^2_D$, containing contributions from the data residuals $d_k = (D_k^{\rm shifted} - T_k)/s_k$, where
$D_k^{\rm shifted} = D_k - \sum_\alpha \beta_{k\alpha} \lambda_{0\alpha}$; and $\chi^2_\lambda$, which is a quadrature sum $\sum_\alpha \lambda_{0\alpha}^2$ of the shifted
nuisance parameters. We list $\chi^2_D$ and $\chi^2_\lambda$ separately in Table 8 and include histograms of the
data residuals $d_k$ and best-fit parameters $\lambda_{0\alpha}$ in Figs. 17 and 18. In the histograms (which
are shown here for CT10 NNLO and NNPDF2.3 NNLO PDFs, but are also representative
of the histograms for the other NNLO PDF sets), the observed $d_k$ and $\lambda_{0\alpha}$ distributions
are narrower than the standard normal distributions shown by the dotted curves. In other
words, the fit to the 2010 ATLAS data is too good and cannot distinguish between the PDF
sets. Most of the 88 best-fit parameters $\lambda_{0\alpha}$ are close to zero, i.e., they do not contribute much
to the improvement of $\chi^2$. None of the best-fit parameters included in Fig. 18 has changed
by more than 2.5 standard deviations. At the Tevatron, some PDF sets required a shift in
the nominal luminosity upwards by as much as 3-4 standard deviations in order to agree
with the single-inclusive jet production data, cf. the appendix in Ref. [30]. In that paper,
it was argued that such shifts are not strictly allowed. The luminosity is common to the
data on the Z and W total cross sections and the Z rapidity distribution, which are rather
constraining, and for which the PDF predictions are consistent with the nominal luminosity,
or even a shift in the downwards direction. It should be a mandatory test of PDFs
that they fit the Tevatron and LHC jet and vector boson production data simultaneously,
while the luminosity uncertainty is treated as completely correlated between the two types
of measurement coming from the same experiment and the same data taking period. This
has not been checked for all PDF sets and could help explain how some inconsistencies
may arise. Note that Fig. 17 in Ref. [30] is of the same form as Fig. 18, but for Tevatron
jet data. For the Tevatron inclusive jet data the distribution of the $\lambda_{0\alpha}$ is as expected, or
even wider for poorly fitting PDFs, in contrast to those for ATLAS data.
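
The split of $\chi^2$ into a residual part and a nuisance-parameter penalty can be sketched in Python as follows. The sketch assumes the standard quadratic form $\chi^2(\lambda) = \sum_k [(D_k - \sum_\alpha \beta_{k\alpha}\lambda_\alpha - T_k)/s_k]^2 + \sum_\alpha \lambda_\alpha^2$, minimizes it analytically with respect to $\lambda_\alpha$, and compares the result with the covariance-matrix form using ${\rm cov}_{ij} = s_i^2 \delta_{ij} + \sum_\alpha \beta_{i\alpha}\beta_{j\alpha}$. The function names and the randomly generated toy inputs are hypothetical. The precise definitions used in the analysis are those of the appendix.

```python
import numpy as np

def profile_nuisance(data, theory, s, beta):
    """Best-fit nuisance parameters and the two chi^2 contributions.

    data, theory : arrays of length N_pt
    s            : uncorrelated uncertainty of each point
    beta         : (N_pt, N_lambda) matrix of correlated shifts
    """
    data, theory, s, beta = (np.asarray(a, dtype=float) for a in (data, theory, s, beta))
    r = (data - theory) / s          # unshifted residuals in units of s_k
    b = beta / s[:, None]            # correlation matrix in the same units
    # Minimizing the quadratic form gives (1 + b^T b) lambda_0 = b^T r.
    a_mat = np.eye(b.shape[1]) + b.T @ b
    lam0 = np.linalg.solve(a_mat, b.T @ r)
    d = r - b @ lam0                 # shifted residuals d_k
    return lam0, float(d @ d), float(lam0 @ lam0)

def chi2_covariance(data, theory, s, beta):
    """Equivalent chi^2 from the covariance matrix diag(s^2) + beta beta^T."""
    data, theory, s, beta = (np.asarray(a, dtype=float) for a in (data, theory, s, beta))
    cov = np.diag(s**2) + beta @ beta.T
    r = data - theory
    return float(r @ np.linalg.solve(cov, r))

# Toy pseudo-data (hypothetical): 6 points, 2 correlated sources.
rng = np.random.default_rng(1)
n_pt, n_lam = 6, 2
theory = rng.uniform(1.0, 2.0, n_pt)
s = 0.05 * theory
beta = 0.10 * theory[:, None] * rng.uniform(0.5, 1.0, (n_pt, n_lam))
data = theory + rng.normal(0.0, s) + beta @ rng.normal(0.0, 1.0, n_lam)

lam0, chi2_D, chi2_lam = profile_nuisance(data, theory, s, beta)
print(chi2_D + chi2_lam, chi2_covariance(data, theory, s, beta))
```

Under these assumptions the two printed numbers agree up to rounding, which is the sense in which a nuisance-parameter fit and a covariance-matrix $\chi^2$ can be numerically equivalent.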

The last column of Table 8 lists the best-fit values of the luminosity shift parameter in
the ATLAS measurement, computed with the FastNLO code. Only one PDF set (HERA1.5
NNLO) requires a $2.4\sigma$ shift in the ATLAS luminosity. However, none of the PDF sets
requires a luminosity shift by more than $3\sigma$, suggesting that they are all compatible with
the 2010 ATLAS jet data. This is despite the wide variety of predictions exhibited in
Fig. 16. Clearly the improvement of the correlated systematic errors will be a priority for
future data, since at present the shifts in data can accommodate quite dramatic differences
in predictions without a large penalty in $\chi^2$.

Figure 15: Comparison of NLO theoretical predictions obtained with various numerical
programs for the 2010 ATLAS measurement of single-inclusive jet production [55].
NNPDF2.3 NLO PDFs and $\alpha_s(M_Z) = 0.119$ are used with all programs.

Figure 16: Comparison of NLO theoretical predictions obtained with various NNLO PDF
sets for the 2010 ATLAS measurement of single-inclusive jet production [55]. APPLgrid
and $\alpha_s(M_Z) = 0.119$ are used with all PDF sets.

Figure 17: Distribution of residuals for the fit of 2010 ATLAS single-inclusive jet data (R =
0.4). Left (right) plot corresponds to using NLO theoretical predictions from FastNLO v.2
with CT10 (NNPDF2.3) NNLO PDFs and $\alpha_s(M_Z) = 0.119$.

Figure 18: Distribution of best-fit nuisance parameters $\lambda_\alpha$ for the fit to the 2010 ATLAS
single-inclusive jet data (R = 0.4). Left (right) plot corresponds to using NLO theoretical
predictions from FastNLO v.2 with CT10 (NNPDF2.3) NNLO PDFs and $\alpha_s(M_Z) = 0.119$.

6 Combined uncertainties in Higgs production



In this section we discuss in somewhat greater detail PDF and $\alpha_s$ uncertainties for Higgs
production via gluon fusion at the LHC, and specifically how PDF updates affect results
obtained using the PDF4LHC recommendation [78] for the determination of PDF+$\alpha_s$
uncertainties. At NLO this prescription entails finding the envelope of CT, MSTW and
NNPDF PDF+$\alpha_s$ uncertainty bands, each obtained with a different choice for the central
value of $\alpha_s$. The outer bands of the envelope are taken as the upper and lower limits of
uncertainty, and the midpoint value as the best prediction. When the prescription was
published, of the three PDF sets included in the prescription, only MSTW was available
at NNLO. The NNLO prescription recommended taking the MSTW08 prediction as the
central value, while rescaling the MSTW08 uncertainty by a factor determined by comparing
at NLO the MSTW08 uncertainty to the envelope uncertainty.

The NNLO cross section for Higgs production at LHC (8 TeV) is currently quoted by
the Higgs Cross Section Working Group (HXSWG)$^8$ as

$\sigma^{\rm NNLO} = 19.52 \pm 1.41$ pb ($\pm 7.2\%$ "PDF+$\alpha_s$").    (3)

The HXSWG cross section numbers have been computed with the current (2010) PDF4LHC
prescription, $m_H = 125$ GeV, and the de Florian-Grazzini code [79], which incorporates
soft-gluon effects up to next-to-next-to-leading logarithmic accuracy on top of the exact NNLO
calculation. Since in this work we use fixed order NNLO calculations as implemented in
iHixs, the central values that we will quote cannot be compared directly to the HXSWG
numbers. However, this should have a minimal effect on the percentage PDF+$\alpha_s$
uncertainty.

We can thus investigate how the combined PDF+$\alpha_s$ uncertainties would change if
computed using an envelope prescription based on the most updated NNLO PDFs from the
three global sets: NNPDF2.3, MSTW08 and CT10. Instead of the exact implementation of
the PDF4LHC envelope, see e.g. Refs. [1,80], for simplicity we use the following definition:
we compute the combined PDF+$\alpha_s$ uncertainties for the three PDF sets for $\alpha_s = 0.117$
and $\alpha_s = 0.119$ and let the maximum and minimum values of the cross section in this
range define the envelope. Combined PDF and $\alpha_s$ uncertainties are obtained by adding the
two uncertainties in quadrature. The uncertainty on $\alpha_s$ is taken to be $\delta\alpha_s = 0.0012$ at the
68% confidence level. The central value is taken as the midpoint of the envelope defined
in this way.

This differs from the 2010 PDF4LHC prescription because in the latter the prediction
from each of the three sets is obtained using a different value of $\alpha_s$ ($\alpha_s = 0.118$ for CTEQ,
$\alpha_s = 0.119$ for NNPDF and $\alpha_s = 0.120$ for MSTW), and also because here the $\alpha_s$ and PDF
uncertainties are added in quadrature instead of being determined exactly in the Hessian or
Monte Carlo method. The change in $\alpha_s$ range moves the central value a little; however, because the
width of the $\alpha_s$ range is unchanged, the uncertainty is not affected significantly. Adding
the PDF and $\alpha_s$ uncertainties in quadrature somewhat reduces the MSTW08 uncertainty.

As in Sect. 3, the cross sections are computed at NNLO with the iHixs code [33]. The
central scale has been taken to be $Q = m_H$, following the recommendations of the Higgs
Cross Section Working Group.

8 https://twiki.cern.ch/twiki/bin/view/LHCPhysics/CERNYellowReportPageAt8TeV

2010 NLO PDFs

| $\alpha_s(M_Z)$ | NNPDF2.0 | MSTW08 | CTEQ6.6 |
|---|---|---|---|
| 0.117 | 14.04 ± 0.20 ± 0.27 | 13.94 ± 0.22 ± 0.27 | 13.49 ± 0.27 ± 0.24 |
| 0.119 | 14.49 ± 0.21 ± 0.27 | 14.38 ± 0.23 ± 0.27 | 13.88 ± 0.28 ± 0.24 |

2012 NLO PDFs

| $\alpha_s(M_Z)$ | NNPDF2.3 | MSTW08 | CT10 |
|---|---|---|---|
| 0.117 | 14.21 ± 0.20 ± 0.25 | 13.94 ± 0.22 ± 0.27 | 13.57 ± 0.28 ± 0.26 |
| 0.119 | 14.61 ± 0.17 ± 0.25 | 14.38 ± 0.23 ± 0.27 | 14.00 ± 0.29 ± 0.26 |

2012 NNLO PDFs

| $\alpha_s(M_Z)$ | NNPDF2.3 | MSTW08 | CT10 |
|---|---|---|---|
| 0.117 | 18.90 ± 0.20 ± 0.38 | 18.45 ± 0.24 ± 0.40 | 18.05 ± 0.36 ± 0.41 |
| 0.119 | 19.54 ± 0.25 ± 0.38 | 19.12 ± 0.25 ± 0.40 | 18.73 ± 0.37 ± 0.41 |

Table 9: The Higgs boson production cross section (in pb) in the gluon fusion channel, for
$m_H = 125$ GeV at LHC 8 TeV. The two uncertainties shown in each case are the PDF and $\alpha_s$
uncertainty.

We begin by computing the envelope defined as above at NLO with the same NLO
PDF sets as the 2010 PDF4LHC prescription: CTEQ6.6, MSTW08, and NNPDF2.0. The
corresponding results for $\alpha_s = 0.117$ and 0.119 are summarized in Table 9. The envelope
is

$\sigma^{\rm NLO} = 13.98 \pm 0.85$ pb ($\pm 6.1\%$ "PDF+$\alpha_s$"),    (4)

so the uncertainty is a bit smaller than the current HXSWG result.

Next, we repeat the computation of the NLO envelope, but now with the most up-to-date
PDF sets: CT10, MSTW08, and NNPDF2.3. Results are also summarized in Table 9
and lead to the envelope:

$\sigma^{\rm NLO} = 14.05 \pm 0.86$ pb ($\pm 6.1\%$ "PDF+$\alpha_s$"),    (5)

so neither the central value nor the uncertainty changes significantly. Note that the increase
in the Higgs cross section using NNPDF2.3, as compared to NNPDF2.0, does not lead to
an increase of the combined PDF+$\alpha_s$ error since the CT10 prediction also increases by a
similar amount.

Finally, using the NNLO cross sections from the most updated NNLO PDF sets, but
otherwise using the same prescription as at NLO, we obtain

$\sigma^{\rm NNLO} = 18.75 \pm 1.24$ pb ($\pm 6.6\%$ "PDF+$\alpha_s$").    (6)

The combined PDF+$\alpha_s$ error is thus essentially unchanged when going from NLO to
NNLO, while the central value is within 2% of the MSTW2008 NNLO value of 18.45 pb,
which in the 2010 PDF4LHC prescription was taken as the central value.

These cross sections are plotted in Fig. 19 and Fig. 20, showing both the cross sections
from each individual PDF set and the envelope.

In summary, neither the central value nor the uncertainty of the NLO prediction is
significantly affected when replacing 2010 PDFs with 2012 PDFs, and if the NLO PDF4LHC
prescription is also used at NNLO, the combined PDF+$\alpha_s$ uncertainty for the Higgs cross
section rises moderately from 6.1% to 6.6% when going from NLO to NNLO.

In this respect, the gluon fusion channel with $m_H = 125$ GeV is an unusually unlucky
case: for most standard candle processes, as well as for other Higgs production modes, and
even for gluon fusion with other values of the Higgs mass, the uncertainties decrease
when going from 2010 NLO PDFs to 2012 NNLO PDFs, as is clear from comparing the
luminosity plots of Section 2 with analogous plots from previous benchmarks [1,2].
Figure 19: The Higgs boson production cross section in the gluon fusion channel using the NLO
PDF sets included in the PDF4LHC prescription for $\alpha_s = 0.117$ and 0.119. The left plot has been
computed with 2010 PDFs and the right plot with 2012 PDF sets. The envelope (dashed violet
horizontal lines) is defined by the upper and lower values of the predictions from all the three PDF
sets and the two values of $\alpha_s$. The solid violet horizontal line is the midpoint of the envelope.

Figure 20: Same as Fig. 19 but using 2012 NNLO PDFs.

To illustrate this explicitly, we compare in Fig. 21 predictions for $W^+$ boson production
based on 2010 NLO and 2012 NNLO PDFs from CT, MSTW and NNPDF. The improved
agreement of the PDF sets is clear: the relative PDF+$\alpha_s$ uncertainty, defined with the
same prescription as for the Higgs cross section, goes down from $\Delta_{{\rm PDF}+\alpha_s} = 5.3\%$ to
$\Delta_{{\rm PDF}+\alpha_s} = 3.3\%$, i.e. from more than twice the MSTW2008 uncertainty (sometimes
used as a simple approximation to the full envelope) to about 1.5 times the MSTW2008
uncertainty. Similar improvements are expected in all quark-initiated cross sections.
Figure 21: The $W^+$ production cross sections determined using the same PDFs and envelope as
in Figs. 19 and 20. The left plot shows 2010 NLO PDFs, the right plot 2012 NNLO PDFs. The recent
8 TeV CMS measurement is also shown.
7 Conclusions and outlook



In this paper we have presented an updated benchmark comparison of the most recent
NNLO PDF sets from the ABM, CT, HERAPDF, MSTW and NNPDF collaborations. We
have compared PDFs, parton luminosities, LHC inclusive cross sections and differential
distributions, always consistently for a common value of $\alpha_s$.

Our main result is that the agreement between the most recent CT, MSTW and
NNPDF NNLO parton distributions is at least as good as it was at NLO, and in many
cases there is a clear improvement, in that the spread of predictions from different groups
is reduced significantly. The HERAPDF1.5 NNLO central values are generally in good
agreement with those of CT, MSTW and NNPDF, but with rather larger uncertainties
due to the smaller dataset that HERAPDF uses. We find no evidence for tension between
the HERA-only PDF sets and the PDF sets based on global data sets. It is interesting to
observe that at NLO the HERAPDF1.5 set has a smaller uncertainty and a more significant
disagreement with other sets. The improvement in methodology in the HERAPDF1.5
NNLO analysis seems not only to enlarge the uncertainty, but also to bring the central
values more in line with the other sets.

We find that in several cases ABM11 disagrees with CT, MSTW and NNPDF both for
PDFs and LHC cross sections, even when a common value of $\alpha_s$ is used. For the ABM11
default value $\alpha_s(M_Z) = 0.1134$, many of these differences with other sets would further
increase (though the vector boson production predictions would become more similar).
We have discussed some of the possible explanations of these differences. A plausible
explanation seems to be the use of the FFN scheme instead of the GM-VFN scheme used
by the other groups, together with the absence of collider data in the ABM11 fit [30].
Other, perhaps less likely, explanations include the presence of higher twist contributions
in the ABM PDF determination. We have also shown (cf. the end of Sect. 3) that the 8
TeV LHC data on total inclusive cross sections tend to disfavor ABM11, especially in top
quark pair production for the default ABM11 $\alpha_s$ value, though experimental uncertainties
are not yet small enough to allow for a decisive discrimination.

For Higgs production via gluon-gluon fusion, we have shown that the combined PDF+$\alpha_s$
uncertainties obtained from the envelope of CT, MSTW and NNPDF sets at NNLO are
very similar to those obtained at NLO, which in turn are unchanged if 2012 instead of 2010
PDFs are used. For several other LHC processes (in particular quark-initiated processes)
the NNLO combined PDF+$\alpha_s$ uncertainty is smaller than the 2010 NLO result.

Available LHC data is already providing important information on PDFs, and future 
LHC data will provide even more stringent constraints. Such constraints will come from 
more precise measurements of already available processes (such as vector boson production 
and jet production), measurements of new PDF sensitive differential distributions (such 
as low-mass Drell-Yan pair, W+charm, $t\bar{t}$, or single-top production), as well as new ways
of combining the existing data (such as ratios of LHC cross sections at different center-of- 
mass energies [57]). 

Here we have presented only a small subset of all the available plots. A complete
repository of all available plots is available at

http://nnpdf.hepforge.org/html/pdfbench/catalog,

where in particular we provide

• Comparisons of PDFs and parton luminosities at NLO and NNLO, for $\alpha_s(M_Z) =
0.117$ and 0.119.

• Comparisons of PDFs at a low scale of 2 GeV$^2$, and as ratios with respect to a
reference set for an LHC scale of $10^4$ GeV$^2$.

• Comparison of PDFs to all the relevant LHC data from ATLAS, CMS and LHCb
at NNLO, for $\alpha_s(M_Z) = 0.117$ and 0.119.

• PDF dependence of benchmark cross sections.
A Definitions of $\chi^2$



The value of the $\chi^2$ estimator depends on the assumed functional form for $\chi^2$ in the
presence of experimental correlated systematic uncertainties. In this appendix, we document
the various definitions of the $\chi^2$ function adopted in this paper and the numerical inputs
that were used to obtain our results.

Statistical experimental errors are usually reported in the form of a list containing
their absolute values, while for systematic errors the list gives relative values expressed as
percentages of the central value. Often the systematic errors are asymmetric, i.e. they
have different positive and negative deviations. The covariance matrix $({\rm cov})_{ij}$ is calculated
from this published information by following one of the methods described below. Needless
to say, it is important, when benchmarking the various PDF predictions, to state precisely
how the covariance matrix was computed. On the other hand, some experiments directly
provide the covariance matrix rather than the list of systematic errors, and in this case no
ambiguity is possible.

A.1 Definitions of $\chi^2$ with the covariance matrix

We can define the $\chi^2$ for a specific experiment with $N_{\rm pt}$ data points by

$$\chi^2 = \sum_{i,j=1}^{N_{\rm pt}} (T_i - D_i)\,({\rm cov}^{-1})_{ij}\,(T_j - D_j), \qquad (7)$$

and use it as a figure of merit to judge the agreement between theory and data. The
covariance matrix $({\rm cov})_{ij}$ used in this definition may be written as

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \Big( \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} \Big) D_i D_j, \qquad (8)$$

where $i$ and $j$ run over the experimental points ($i,j = 1, \ldots, N_{\rm pt}$), $D_i$ are the measured
central values, and $T_i$ the corresponding theoretical predictions computed with a given set
of PDFs. This covariance matrix depends on uncorrelated uncertainties $s_i$, constructed
by adding the statistical and uncorrelated systematic uncertainties in quadrature; $N_{\mathcal L}$
multiplicative normalization uncertainties, $\sigma^{(\mathcal L)}_{i,\alpha}$; and $N_c$ other correlated systematic
uncertainties, expressed for convenience in the above equation in terms of their relative values
$\sigma^{(c)}_{i,\alpha}$. The total number of correlated uncertainties is thus $N_\lambda = N_{\mathcal L} + N_c$. Asymmetric
systematic uncertainties provided by the experiments must be symmetrized to use this
expression. We symmetrize them by averaging, $\sigma^{(c)}_{i,\alpha} = \frac{1}{2}\big(\sigma^{(c)+}_{i,\alpha} + \sigma^{(c)-}_{i,\alpha}\big)$.
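
As an illustration of Eqs. (7) and (8), the following minimal sketch builds the experimental covariance matrix from relative systematic uncertainties and evaluates the $\chi^2$ by solving a linear system rather than inverting the matrix explicitly. All numerical inputs are hypothetical toy values, the function names are illustrative, and the symmetrization assumes that the asymmetric uncertainties are supplied as positive magnitudes.

```python
def symmetrize(plus, minus):
    """Average the magnitudes of an asymmetric relative uncertainty (assumed convention)."""
    return 0.5 * (abs(plus) + abs(minus))


def exp_covariance(data, s, sigma_c, sigma_l):
    """Eq. (8): delta_ij s_i^2 + (sum_a sc_ia sc_ja + sum_a sl_ia sl_ja) D_i D_j.

    sigma_c[a][i], sigma_l[a][i]: relative uncertainty of source a on point i.
    """
    n = len(data)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            corr = sum(src[i] * src[j] for src in sigma_c + sigma_l)
            cov[i][j] = corr * data[i] * data[j] + (s[i] ** 2 if i == j else 0.0)
    return cov


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def chi2(theory, data, cov):
    """Eq. (7): sum_ij (T_i - D_i) (cov^-1)_ij (T_j - D_j)."""
    diff = [t - d for t, d in zip(theory, data)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))


# Hypothetical toy inputs (not taken from any experiment).
data = [10.0, 8.0, 5.0]
theory = [10.4, 7.9, 5.2]
s = [0.3, 0.25, 0.2]  # absolute uncorrelated uncertainties
sigma_c = [[symmetrize(p, m) for p, m in [(0.03, 0.01), (0.04, 0.02), (0.01, 0.01)]]]
sigma_l = [[0.04, 0.04, 0.04]]  # one 4% normalization uncertainty

cov = exp_covariance(data, s, sigma_c, sigma_l)
print("chi2 / N_pt =", chi2(theory, data, cov) / len(data))
```

With no correlated sources the matrix is diagonal and the expression reduces to the familiar sum of squared pulls.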

Note that it is important when fitting to distinguish between additive uncertainties
(where the experimentalists have determined an absolute shift in the observable due to a
systematic uncertainty) and multiplicative uncertainties (where the experimentalists have
determined a relative shift, as a fraction of the measured observable). In particular it is
important not to mistake an additive uncertainty for a multiplicative one just because it is
presented multiplicatively (as are the correlated systematics in Eq. (8), where the absolute
shift in data point $i$ from systematic uncertainty $\alpha$ is written as $\sigma^{(c)}_{i,\alpha} D_i$). Correlated
systematics which are truly multiplicative should of course be treated in the same way as
the normalization uncertainty.

This distinction is important because if Eq. (8) were used as a figure of merit in an
actual PDF fit, it would result in a D'Agostini bias from the multiplicative uncertainties [81].
However, it is a suitable objective criterion for comparing a posteriori the various predictions
from the different PDF sets that are discussed here, and we have used it as such throughout
the body of this paper.

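An alternative definition of the covariance matrix is the $t_0$-prescription [81], where a
fixed theory prediction $T_i^{(0)}$ (e.g., the final theory prediction from a previous fit) is used
to define the normalization contribution to the $\chi^2$. In the $t_0$-prescription the covariance
matrix is thus

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} D_i D_j + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} T_i^{(0)} T_j^{(0)}. \qquad (9)$$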

This definition has the advantage of avoiding the D'Agostini bias from multiplicative 
normalization uncertainties when performing a PDF fit. 
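
The bias can be seen in a two-point toy fit. The sketch below fits a single constant to two measurements of the same quantity that share a common normalization uncertainty, once with the normalization term scaled by the data (the experimental definition) and once scaled by a fixed reference value (the $t_0$ definition). The numbers, the reference value `t0`, and the helper name are illustrative assumptions, not inputs used in this paper.

```python
def best_fit_constant(data, cov):
    """Constant T minimizing chi2 when the same T is predicted for both points.

    T = sum_ij (C^-1)_ij D_j / sum_ij (C^-1)_ij; for a 2x2 matrix the common
    1/det factor of the inverse cancels in the ratio.
    """
    (c11, c12), (c21, c22) = cov
    inv = [[c22, -c12], [-c21, c11]]  # inverse up to the 1/det factor
    num = sum(inv[i][j] * data[j] for i in range(2) for j in range(2))
    den = sum(inv[i][j] for i in range(2) for j in range(2))
    return num / den


data = [8.0, 8.5]   # two measurements of the same quantity (toy values)
s = [0.16, 0.17]    # 2% uncorrelated uncertainties, in absolute terms
f = 0.10            # 10% common normalization uncertainty
t0 = 8.25           # fixed reference prediction (hypothetical choice)

# Normalization term proportional to D_i D_j (experimental definition).
cov_exp = [[(s[i] ** 2 if i == j else 0.0) + f ** 2 * data[i] * data[j]
            for j in range(2)] for i in range(2)]
# Normalization term proportional to the fixed reference value (t0 definition).
cov_t0 = [[(s[i] ** 2 if i == j else 0.0) + f ** 2 * t0 * t0
           for j in range(2)] for i in range(2)]

fit_exp = best_fit_constant(data, cov_exp)
fit_t0 = best_fit_constant(data, cov_t0)
print("experimental definition:", round(fit_exp, 3))
print("t0 definition:          ", round(fit_t0, 3))
```

In this toy case the fit with the data-scaled normalization term lands below both measurements, while the fit with the fixed reference value returns the ordinary weighted average, independently of the value chosen for `t0`.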

When the breakdown into additive and multiplicative uncertainties is not provided
by the experiment, one may use $T_i^{(0)}$ to compute all systematic uncertainties, to give an
'extended-$t_0$' prescription:

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \Big( \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} \Big) T_i^{(0)} T_j^{(0)}. \qquad (10)$$

This prescription rescales by $T_i^{(0)}$ all multiplicative uncertainties (associated with the
normalization or not), but also modifies the additive uncertainties given by the experiment
in a mild way consistent with their overall uncertainty. We will see below that the $t_0$
covariance matrix Eq. (9) and the extended-$t_0$ covariance matrix Eq. (10) generally produce
lower $\chi^2$ values than the experimental definition in Eq. (8) for datasets with substantial
systematic uncertainties.

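In summary, we consider in this appendix three possible definitions of the covariance
matrix:

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \Big( \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} \Big) D_i D_j, \qquad \text{``Exp''}$$

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} D_i D_j + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} T_i^{(0)} T_j^{(0)}, \qquad \text{``}t_0\text{''}$$

$$({\rm cov})_{ij} = \delta_{ij} s_i^2 + \Big( \sum_{\alpha=1}^{N_c} \sigma^{(c)}_{i,\alpha} \sigma^{(c)}_{j,\alpha} + \sum_{\alpha=1}^{N_{\mathcal L}} \sigma^{(\mathcal L)}_{i,\alpha} \sigma^{(\mathcal L)}_{j,\alpha} \Big) T_i^{(0)} T_j^{(0)}. \qquad \text{``Extended-}t_0\text{''}$$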
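
The three definitions differ only in which central values multiply the relative uncertainties, which can be made explicit in a few lines. This is a sketch under stated assumptions: the function name, the `kind` labels and the argument layout are hypothetical, and the inputs in the usage lines are toy values.

```python
def covariance(kind, data, t0, s, sigma_c, sigma_l):
    """Build (cov)_ij for kind in {"exp", "t0", "ext-t0"}.

    sigma_c[a][i] / sigma_l[a][i]: relative correlated / normalization
    uncertainty of source a on point i; s[i]: absolute uncorrelated uncertainty;
    t0[i]: fixed reference theory prediction for point i.
    """
    if kind not in ("exp", "t0", "ext-t0"):
        raise ValueError(f"unknown definition: {kind}")
    ref_c = t0 if kind == "ext-t0" else data  # scale for correlated systematics
    ref_l = data if kind == "exp" else t0     # scale for normalization terms
    n = len(data)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            corr = sum(a[i] * a[j] for a in sigma_c) * ref_c[i] * ref_c[j]
            norm = sum(a[i] * a[j] for a in sigma_l) * ref_l[i] * ref_l[j]
            cov[i][j] = corr + norm + (s[i] ** 2 if i == j else 0.0)
    return cov


# Toy usage: two data points, one correlated and one normalization source.
data, t0, s = [10.0, 8.0], [10.5, 7.6], [0.3, 0.25]
sigma_c, sigma_l = [[0.02, 0.03]], [[0.04, 0.04]]
for kind in ("exp", "t0", "ext-t0"):
    print(kind, covariance(kind, data, t0, s, sigma_c, sigma_l))
```

If the reference predictions coincide with the data, all three matrices are identical, so the differences between the prescriptions are driven entirely by the differences between $T_i^{(0)}$ and $D_i$.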
A.2 Definitions of $\chi^2$ with shift parameters

An alternative, yet numerically equivalent, representation for the $\chi^2$ function has been used
in the jet benchmarking exercise of Sec. 5, following the method traditionally adopted in
the CTEQ and MSTW PDF fits for jet and some other data sets. In this representation,
the $\chi^2$ figure of merit for goodness-of-fit to an experiment with correlated systematic
uncertainties is expressed as [77]

$$\chi^2(\{a\},\{\lambda\}) = \chi^2_D + \chi^2_\lambda, \qquad (11)$$

where

$$\chi^2_D = \sum_{k=1}^{N_{\rm pt}} \frac{1}{s_k^2} \Big( D_k - T_k - \sum_{\alpha=1}^{N_\lambda} \beta_{k,\alpha} \lambda_\alpha \Big)^2 \qquad (12)$$

and

$$\chi^2_\lambda = \sum_{\alpha=1}^{N_\lambda} \lambda_\alpha^2, \qquad (13)$$



using the same notation as in the previous section, where the $\beta_{k,\alpha}$ are the absolute
correlated uncertainties. Systematic uncertainties associated with $N_\lambda$ sources may now induce
correlated variations (shifts) in the experimental data points. Their effect is approximated
by including a sum $\sum_\alpha \beta_{k,\alpha} \lambda_\alpha$ dependent on the correlation matrix $\beta_{k,\alpha}$ ($k = 1, \ldots, N_{\rm pt}$;
$\alpha = 1, \ldots, N_\lambda$) and stochastic nuisance parameters $\lambda_\alpha$, with one nuisance parameter
assigned to every source of systematic uncertainty. By a common assumption, each
$\lambda_\alpha$ follows the standard normal distribution. Its deviation from $\lambda_\alpha = 0$ incurs a penalty
contribution $\lambda_\alpha^2$ to $\chi^2$. Under this assumption the minimum of $\chi^2$ with respect to $\lambda_\alpha$ can
be found algebraically, since the dependence on $\lambda_\alpha$ is quadratic [77].
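
The algebraic step can be written out explicitly. As a sketch in the notation of Eqs. (11)-(13), setting the derivative with respect to each nuisance parameter $\lambda_\gamma$ to zero gives a linear system; the shorthand $B_\gamma$ is introduced here only for illustration.

```latex
\frac{\partial \chi^2}{\partial \lambda_\gamma}
  = -2 \sum_{k=1}^{N_{\rm pt}} \frac{\beta_{k,\gamma}}{s_k^2}
      \Big( D_k - T_k - \sum_{\alpha=1}^{N_\lambda} \beta_{k,\alpha} \lambda_\alpha \Big)
    + 2 \lambda_\gamma = 0
\quad\Longrightarrow\quad
\sum_{\alpha=1}^{N_\lambda}
  \Big( \delta_{\gamma\alpha}
        + \sum_{k=1}^{N_{\rm pt}} \frac{\beta_{k,\gamma} \beta_{k,\alpha}}{s_k^2} \Big)
  \lambda_\alpha = B_\gamma ,
\qquad
B_\gamma \equiv \sum_{k=1}^{N_{\rm pt}} \frac{\beta_{k,\gamma} (D_k - T_k)}{s_k^2} .
```

Because the matrix in parentheses is the identity plus a positive semi-definite term, it is always invertible, so the best-fit nuisance parameters are unique.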

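We can solve for the best-fit values $\lambda_{0\alpha}$ of the nuisance parameters to find

$$\lambda_{0\alpha} = \sum_{i=1}^{N_{\rm pt}} \frac{D_i - T_i}{s_i} \sum_{\delta=1}^{N_\lambda} A^{-1}_{\alpha\delta} \frac{\beta_{i,\delta}}{s_i}, \qquad (14)$$

with

$$A_{\alpha\beta} = \delta_{\alpha\beta} + \sum_{k=1}^{N_{\rm pt}} \frac{\beta_{k,\alpha} \beta_{k,\beta}}{s_k^2}. \qquad (15)$$

When these $\lambda_{0\alpha}$ values are substituted into Eq. (11), one obtains the usual expression
Eq. (7) for the $\chi^2$, with

$$({\rm cov})^{-1}_{ij} = \frac{\delta_{ij}}{s_i^2} - \sum_{\alpha,\beta=1}^{N_\lambda} \frac{\beta_{i,\alpha}}{s_i^2} A^{-1}_{\alpha\beta} \frac{\beta_{j,\beta}}{s_j^2}, \qquad (16)$$

the inverse of

$$({\rm cov})_{ij} = s_i^2 \delta_{ij} + \sum_{\alpha=1}^{N_\lambda} \beta_{i,\alpha} \beta_{j,\alpha}. \qquad (17)$$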
If the absolute correlation $\beta_{i,\alpha}$ is related to the relative correlation $\sigma_{i,\alpha}$ by multiplying
by the experimental central values for both $\sigma^{(c)}_{i,\alpha}$ and $\sigma^{(\mathcal L)}_{i,\alpha}$,

$$\beta_{i,\alpha} = \sigma_{i,\alpha} D_i, \qquad (18)$$

the expression in Eq. (17) coincides with the covariance matrix introduced earlier in
Eq. (8). It is equivalent to the usual definition Eq. (7), but also contains explicit
information about the values of the systematic parameters $\lambda_{0\alpha}$ at the best fit.
If instead of Eq. (18) we set

$$\beta_{i,\alpha} = \sigma_{i,\alpha} T_i^{(0)}, \qquad (19)$$

we recover the extended-$t_0$ $\chi^2$ in Eq. (10). Finally, using Eq. (18) to find $\beta^{(c)}_{i,\alpha}$ and Eq. (19)
to find $\beta^{(\mathcal L)}_{i,\alpha}$, we recover the $t_0$ definition in Eq. (9). Thus the $\chi^2$ values in the shift method
as defined here are entirely equivalent to the methods based on direct inversion of the
covariance matrix in Sec. A.1.
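
The equivalence can also be checked numerically. The sketch below evaluates the $\chi^2$ once by profiling the nuisance parameters, following Eqs. (11)-(15), and once with the covariance matrix of Eqs. (7) and (17); the two results should agree to rounding error. The data are hypothetical toy values, the absolute shifts are obtained as in Eq. (18), and all function names are illustrative.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def chi2_shift(data, theory, s, beta):
    """Shift method: return (chi2, best-fit nuisance parameters).

    beta[a][k] is the absolute shift of point k from systematic source a.
    """
    n_pt, n_src = len(data), len(beta)
    resid = [d - t for d, t in zip(data, theory)]
    # A_ab = delta_ab + sum_k beta_ka beta_kb / s_k^2, Eq. (15)
    a_mat = [[(1.0 if a == b else 0.0)
              + sum(beta[a][k] * beta[b][k] / s[k] ** 2 for k in range(n_pt))
              for b in range(n_src)] for a in range(n_src)]
    b_vec = [sum(beta[a][k] * resid[k] / s[k] ** 2 for k in range(n_pt))
             for a in range(n_src)]
    lam = solve(a_mat, b_vec)
    chi2_d = sum(((resid[k] - sum(beta[a][k] * lam[a] for a in range(n_src))) / s[k]) ** 2
                 for k in range(n_pt))
    return chi2_d + sum(x * x for x in lam), lam


def chi2_cov(data, theory, s, beta):
    """Covariance-matrix method with cov_ij = s_i^2 delta_ij + sum_a beta_ia beta_ja."""
    n = len(data)
    cov = [[(s[i] ** 2 if i == j else 0.0) + sum(src[i] * src[j] for src in beta)
            for j in range(n)] for i in range(n)]
    diff = [t - d for t, d in zip(theory, data)]
    return sum(x * y for x, y in zip(diff, solve(cov, diff)))


# Hypothetical toy inputs (not taken from any experiment).
data = [10.0, 8.0, 5.0]
theory = [10.4, 7.9, 5.2]
s = [0.3, 0.25, 0.2]
sigma = [[0.02, 0.03, 0.01], [0.04, 0.04, 0.04]]          # relative uncertainties
beta = [[sig * d for sig, d in zip(src, data)] for src in sigma]  # as in Eq. (18)

shift_value, lam = chi2_shift(data, theory, s, beta)
cov_value = chi2_cov(data, theory, s, beta)
print(shift_value, cov_value, lam)
```

The shift method costs only the solution of an $N_\lambda \times N_\lambda$ system and returns the best-fit nuisance parameters as a by-product, which is the extra information mentioned above.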



A.3 Impact on ATLAS jet cross sections

Numerical comparisons of the different $\chi^2$ prescriptions will depend on the exact procedure
used to determine $s_i$ and $\sigma_{i,\alpha}$. For example, in the comparisons to the ATLAS jet data in
Sec. 5, we compute $\beta_{k,\alpha}$ using Eq. (18) (equivalent to Eq. (8)), averaging any asymmetric
errors. Given the large number of independent systematic parameters ($N_\lambda = 88$), the
asymmetry of some nuisance parameters is not expected to significantly bias the resulting
PDFs, which has been confirmed by computing the $\chi^2$ tables using the same $\chi^2$ definition,
but following alternative error symmetrization procedures. In all cases examined, the
choice of the symmetrization procedure had a smaller effect on $\chi^2$ for the ATLAS jet data
than the choice of the $\chi^2$ definition.

We have also checked numerically that the covariance matrix definitions described in
Sec. A.1 and the corresponding shift definitions described in Sec. A.2 give the same results
when implemented numerically (as they should). Thus for the remainder of this section we
will focus on the difference between the three definitions of the covariance matrix described
in Sec. A.1.

In Table 10 we compare the default 'experimental' definition of the covariance matrix
used in the paper (cf. Eq. (8)) and the $t_0$ definition of Eq. (9). In this case, recent LHC
measurements for W, Z, and jet production are compared to NLO predictions with five
PDF sets and $\alpha_s = 0.119$. Results at NNLO and for other values of the strong coupling are
qualitatively similar. One can see that the $t_0$ definition leads to smaller numerical values
of $\chi^2$ for all PDF sets considered, especially in experiments with sizable normalization
contributions, though it is also clear that the qualitative comparison between PDF sets in
Sect. 2 is not affected by this alternative definition.

Similarly, the experimental definition is compared with the extended-$t_0$ definition in
the case of ATLAS jet production with $R = 0.4$ in Table 11. The comparisons are made for
the NLO PDF sets, $\alpha_s$ values, and computer codes specified in the table. Three columns
of $\chi^2/N_{\rm pt}$ are shown, corresponding to the 'experimental' definition realized according to
Eqs. (17) and (18) in column 1; and the extended-$t_0$ definition based on Eqs. (17) and (19),
with the reference cross sections found using the central CT10 NLO in column 2 and
NLO PDFs, $\alpha_s = 0.119$

Dataset                 NNPDF2.3   MSTW08   CT10    ABM11   HERAPDF1.5
ATLAS W, Z (Exp)        1.268      2.004    1.062   1.558   1.747
ATLAS W, Z ($t_0$)      1.292      2.024    1.026   1.487   1.676
CMS W el asy (Exp)      0.820      4.690    1.419   1.915   0.687
CMS W el asy ($t_0$)    0.820      4.690    1.419   1.915   0.687
LHCb W (Exp)            0.670      0.907    1.064   2.328   4.125
LHCb W ($t_0$)          0.662      0.896    1.046   2.298   4.100
ATLAS jets (Exp)        0.999      0.974    1.350   1.342   1.106
ATLAS jets ($t_0$)      0.836      0.825    1.234   1.317   1.032

Table 10: The $\chi^2/N_{\rm pt}$ values for the available LHC data with published correlated uncertainties,
computed using the five PDF sets considered. The experimental ("Exp") definition of $({\rm cov})_{ij}$ in
Eq. (8) is compared to the $t_0$ definition in Eq. (9). Theoretical predictions have been computed at
NLO with APPLgrid for a common value of the strong coupling $\alpha_s(M_Z) = 0.119$.



                                   $({\rm cov})_{ij}$ definition
NLO PDF     $\alpha_s$   Code       Exp     Ext. $t_0$ (CT10)   Ext. $t_0$ (NNPDF2.3)
CT10        0.118        FastNLO    0.95    0.55                0.60
CT10        0.118        MEKS1      1.00    0.57                0.61
CT10        0.118        MEKS2      0.89    0.55                0.59
NNPDF2.3    0.119        FastNLO    0.87    0.60                0.57
NNPDF2.3    0.119        MEKS1      0.90    0.58                0.55
NNPDF2.3    0.119        MEKS2      0.78    0.54                0.53
NNPDF2.3    0.119        APPLgrid   1.00    0.64                0.62

Table 11: The $\chi^2/N_{\rm pt}$ values for the ATLAS inclusive jet production data obtained with the
experimental and extended-$t_0$ definitions of the $\chi^2$ function. The cross sections are computed at
NLO using the specified NLO PDFs, $\alpha_s$ values, and the following codes: FastNLO, MEKS with $\mu_{F,R}$
equal to the individual jet $p_T$ (MEKS1) or $p_T$ of the hardest jet (MEKS2), and APPLgrid.

NNPDF2.3 NLO PDFs in column 3 (see footnote 9). In this case, the $\chi^2/N_{\rm pt}$ values in columns 2
and 3 are noticeably lower than in column 1. They are not exactly the same in columns
2 and 3, indicating that $\chi^2$ also depends to some extent on the PDF that was used to
compute $T_i^{(0)}$. However, this difference is much smaller than the difference between results
using different codes, or different scale choices.

The comparisons of the three covariance matrix definitions in the two tables indicate
that, for the ATLAS jet data, the difference in the corresponding $\chi^2$ values is quite large.
Note that in this comparison, the $t_0$ covariance matrix treats only the normalization of
these data as multiplicative, whereas the extended-$t_0$ treats all systematic uncertainties
as multiplicative. Hence, it is always important to know when performing a fit whether a
correlated error as determined by the experimentalists is multiplicative (hence, susceptible
to the D'Agostini bias) or additive, since this will affect the impact of that data on the fit.



9 The "exp" NNPDF2.3 entry with APPLgrid in this table is numerically equivalent to the corresponding
"exp" entry in the next-to-last row of Table 10.